Compositions for use in identification of orthopoxviruses

ABSTRACT

Oligonucleotide primers and compositions and kits containing the same for rapid identification of orthopoxviruses by amplification of a segment of viral nucleic acid followed by molecular mass analysis are provided.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of U.S. patent application Ser. No. 10/728,486 filed Dec. 5, 2003, and claims the benefit of priority to U.S. Provisional Application Ser. No. 60/604,329 filed Aug. 24, 2004, each of which is incorporated herein by reference in its entirety.

STATEMENT OF GOVERNMENT SUPPORT

This invention was made with United States Government support under DARPA/SPO contract BAA00-09. The United States Government may have certain rights in the invention.

FIELD OF THE INVENTION

The present invention relates generally to the field of genetic identification and quantification of orthopoxviruses and provides methods, compositions and kits useful for this purpose, as well as others, when combined with molecular mass analysis.

BACKGROUND OF THE INVENTION

A. Orthopoxviruses

The poxviruses comprise a large family of complex DNA viruses that infect both vertebrate and invertebrate hosts. General properties of the Poxvirus family include (a) a large complex virion containing enzymes for mRNA synthesis, (b) a genome composed of a single linear double-strand DNA molecule of 130 to 300 kilobases, and (c) the ability to replicate within the cytoplasmic compartment of the cell. The vertebrate poxviruses have been placed into six genera: Orthopoxvirus, Parapoxvirus, Capripoxvirus, Leporipoxvirus, Suipoxvirus, and Avipoxvirus.

Three members of the Orthopoxvirus genus are known to cause disease in humans. The most notorious member of the Poxvirus family is the variola virus which, before its eradication, was responsible for smallpox. Cowpox virus and Monkeypox virus also cause disease in humans. Additional members of the Orthopoxvirus genus include: Buffalopox virus, Camelpox virus, Rabbitpox virus, Raccoonpox virus, Volepox virus and Ectromelia virus.

B. Bioagent Detection

A problem in determining the cause of a natural infectious outbreak or a bioterrorist attack is the sheer variety of organisms that can cause human disease. There are over 1400 organisms infectious to humans; many of these have the potential to emerge suddenly in a natural epidemic or to be used in a malicious attack by bioterrorists (Taylor et al., Philos. Trans. R. Soc. London B. Biol. Sci., 2001, 356, 983-989). This number does not include numerous strain variants, bioengineered versions, or pathogens that infect plants or animals.

Much of the new technology being developed for detection of biological weapons incorporates a polymerase chain reaction (PCR) step based upon the use of highly specific primers and probes designed to selectively detect individual pathogenic organisms. Although this approach is appropriate for the most obvious bioterrorist organisms, like smallpox and anthrax, experience has shown that it is very difficult to predict which of hundreds of possible pathogenic organisms might be employed in a terrorist attack. Likewise, naturally emerging human diseases that have caused devastating consequences in public health have come from unexpected families of bacteria, viruses, fungi, or protozoa. Plants and animals also have their natural burden of infectious disease agents and there are equally important biosafety and security concerns for agriculture.

An alternative to single-agent tests is to perform broad-range consensus priming of a gene target conserved across groups of bioagents. Broad-range priming has the potential to generate amplification products across entire genera, families, or, as with bacteria, an entire domain of life. This strategy has been successfully employed using consensus 16S ribosomal RNA primers for determining bacterial diversity, both in environmental samples (Schmidt et al., J. Bact., 1991, 173, 4371-4378) and in natural human flora (Kroes et al., Proc. Nat. Acad. Sci. (USA), 1999, 96, 14547-14552). One drawback of this approach for unknown bioagent detection and epidemiology is that analysis of the PCR products requires cloning and sequencing of hundreds to thousands of colonies per sample, which is impractical to perform rapidly or on a large number of samples.

Conservation of sequence is not as universal for viruses. Large groups of viral species, however, share conserved protein-coding regions, such as regions encoding viral polymerases or helicases. As with bacteria, consensus priming has also been described for detection of several viral families, including coronaviruses (Stephensen et al., Vir. Res., 1999, 60, 181-189), enteroviruses (Oberste et al., J. Virol., 2002, 76, 1244-51; Oberste et al., J. Clin. Virol., 2003, 26, 375-7; and Oberste et al., Virus Res., 2003, 91, 241-8), retroid viruses (Mack et al., Proc. Natl. Acad. Sci. U.S.A., 1988, 85, 6977-81; Seifarth et al., AIDS Res. Hum. Retroviruses, 2000, 16, 721-729; and Donehower et al., J. Vir. Methods, 1990, 28, 33-46), and adenoviruses (Echavarria et al., J. Clin. Micro., 1998, 36, 3323-3326). However, as with bacteria, there is no adequate analytical method other than sequencing to identify the viral bioagent present.

In contrast to PCR-based methods, mass spectrometry provides detailed information about the molecules being analyzed, including high mass accuracy. It is also a process that can be easily automated. DNA chips with specific probes can only determine the presence or absence of specifically anticipated organisms. Because there are hundreds of thousands of species of benign organisms, some very similar in sequence to threat organisms, even arrays with 10,000 probes lack the breadth needed to identify a particular organism.

There is a need for a method for identification of bioagents which is both specific and rapid, and in which no culture or nucleic acid sequencing is required.

The present invention provides, inter alia, methods of identifying unknown viruses, including viruses of the Orthopoxvirus genus. Also provided are oligonucleotide primers, compositions, and kits containing the oligonucleotide primers, which define orthopoxvirus identifying amplicons and, upon amplification, produce corresponding amplification products whose molecular masses provide the means to identify orthopoxviruses at the species and sub-species or strain level.

SUMMARY OF THE INVENTION

The present invention provides, inter alia, primers and compositions comprising pairs of primers, and kits containing the same for use in identification of orthopoxviruses. The primers are designed to produce orthopoxvirus identifying amplicons of DNA encoding genes essential to orthopoxvirus replication. The invention further provides compositions comprising one or more pairs of primers and kits containing the same, which are designed to provide species and sub-species or strain level characterization of orthopoxviruses.

The individual orthopoxvirus primers of the invention are primers that are 13 to 35 nucleobases in length comprising at least 70% sequence identity with any of SEQ ID NOs: 1, 2, 3, 4, 5, 6, 24, 25, 26, 27, 28, and 29. The primer pairs of the invention comprise these same individual primers in the following combinations: SEQ ID NOs: 1:24, 2:25, 3:26, 4:27, 5:28, and 6:29. The kits of the invention can comprise any combination of the same primer pairs.

The invention also provides methods of using the primer pairs and kits comprising the same for identification of orthopoxviruses and also for determining the presence or absence of an orthopoxvirus in a sample by using the primer pairs to obtain orthopoxvirus bioagent identifying amplicons, determining their molecular masses or base compositions and comparing the molecular masses or base compositions with molecular masses or base compositions of known orthopoxvirus bioagent identifying amplicons.

The invention also provides orthopoxvirus bioagent identifying amplicons obtained by amplification of a segment of a genome of an orthopoxvirus with any of the primer pairs listed above. The orthopoxvirus genomes from which orthopoxvirus bioagent identifying amplicons are obtained include, but are not limited to, the GenBank Accession numbers given in Table 3 (vide infra).

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a representative process diagram illustrating a representative primer design process.

FIG. 2 is a representative process diagram for identification and determination of the quantity of a bioagent in a sample.

FIG. 3 is a pseudo 4-D plot of base compositions of orthopoxviruses obtained with primer pair number 299.

FIG. 4 is a pseudo 4-D plot of base compositions of orthopoxviruses obtained with primer pair number 297.

DETAILED DESCRIPTION OF EMBODIMENTS

The present invention provides, inter alia, methods for detection and identification of orthopoxviruses in an unbiased manner using orthopoxvirus identifying amplicons. Intelligent primers are selected to hybridize to conserved sequence regions of nucleic acids derived from an orthopoxvirus and which bracket or flank variable sequence regions to yield an orthopoxvirus identifying amplicon. The orthopoxvirus identifying amplicon can be amplified and is amenable to molecular mass determination. The molecular mass then provides a means to uniquely identify the orthopoxvirus without a requirement for prior knowledge of the possible identity of the orthopoxvirus. The molecular mass or corresponding base composition signature (BCS) of the amplification product is then matched against a database of molecular masses or base composition signatures. Furthermore, the method can be applied to rapid parallel multiplex analyses, the results of which can be employed in a triangulation identification strategy. The present method provides rapid throughput and does not require nucleic acid sequencing of the amplified target sequence for orthopoxvirus detection and identification.

In the context of the present invention, a “bioagent” is any organism, cell, or virus, living or dead, or a nucleic acid derived from such an organism, cell or virus. Examples of bioagents include, but are not limited to, cells (including but not limited to human clinical samples, cell cultures, bacterial cells and other pathogens), viruses, viroids, fungi, protists, parasites, and pathogenicity markers (including but not limited to: pathogenicity islands, antibiotic resistance genes, virulence factors, toxin genes and other bioregulating compounds). Samples may be alive or dead or in a vegetative state (for example, vegetative bacteria or spores) and may be encapsulated or bioengineered. In the context of this invention, a “pathogen” is a bioagent which causes a disease or disorder.

As used herein, “intelligent primers” are primers that are designed to bind to highly conserved sequence regions of a bioagent identifying amplicon that flank an intervening variable region and yield amplification products which ideally provide enough variability to distinguish each individual bioagent, and which are amenable to molecular mass analysis. By the term “highly conserved,” it is meant that the sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity among all or at least 70%, at least 80%, at least 90%, at least 95%, or at least 99% of species or strains.

As used herein, “broad range survey primers” are intelligent primers designed to identify an unknown bioagent at the genus level. In some cases, broad range survey primers are able to identify unknown bioagents at the species or sub-species level. As used herein, “division-wide primers” are intelligent primers designed to identify a bioagent at the species level and “drill-down” primers are intelligent primers designed to identify a bioagent at the sub-species level. As used herein, the “sub-species” level of identification includes, but is not limited to, strains, subtypes, variants, and isolates.

As used herein, a “bioagent division” is defined as a group of bioagents above the species level and includes, but is not limited to, orders, families, classes, clades, genera or other such groupings of bioagents above the species level.

As used herein, a “sub-species characteristic” is a genetic characteristic that provides the means to distinguish two members of the same bioagent species. For example, one viral strain could be distinguished from another viral strain of the same species by possessing a genetic change (for example, a nucleotide deletion, addition or substitution) in one of the viral genes, such as the RNA-dependent RNA polymerase. In this case, the sub-species characteristic that can be identified using the methods of the present invention is the genetic change in the viral polymerase.

As used herein, the term “bioagent identifying amplicon” refers to a polynucleotide that is amplified from a bioagent in an amplification reaction and 1) whose sequence ideally provides base composition variability to distinguish among individual bioagents and 2) whose molecular mass is amenable to molecular mass determination.

As used herein, a “base composition” is the exact number of each nucleobase (A, T, C and G) in a given sequence. As used herein, a “base composition signature” (BCS) is the exact base composition (i.e., the number of A, T, G and C nucleobases) determined from the molecular mass of a bioagent identifying amplicon.
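The definition of a base composition can be illustrated with a short Python sketch. The function name `base_composition` and the example sequence are hypothetical and are used only for illustration; they do not come from the sequence listing.

```python
from collections import Counter


def base_composition(sequence):
    """Count each nucleobase (A, G, C, T) in a DNA sequence.

    Hypothetical helper for illustration; rejects any character
    other than A, G, C or T.
    """
    sequence = sequence.upper()
    counts = Counter(sequence)
    unexpected = set(counts) - set("AGCT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Always report all four bases, even when a count is zero.
    return {base: counts.get(base, 0) for base in "AGCT"}


# Illustrative sequence only.
print(base_composition("AAGCTT"))
```

Two sequences that differ only in the order of their nucleobases give the same base composition, which is why the base composition carries less information than the sequence itself but can be inferred from a mass measurement.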

As used herein, a “base composition probability cloud” is a representation of the diversity in base composition resulting from a variation in sequence that occurs among different isolates of a given species. The “base composition probability cloud” represents the base composition constraints for each species and is typically visualized using a pseudo four-dimensional plot.

As used herein, a “wobble base” is a variation in a codon found at the third nucleotide position of a DNA triplet. Variations in conserved regions of sequence are often found at the third nucleotide position due to redundancy in the amino acid code.

In the context of the present invention, the term “unknown bioagent” may mean either: (i) a bioagent whose existence is known (such as the well known bacterial species Staphylococcus aureus for example) but which is not known to be in a sample to be analyzed, or (ii) a bioagent whose existence is not known (for example, the SARS coronavirus was unknown prior to April 2003). For example, if the method for identification of coronaviruses disclosed in commonly owned U.S. patent application Ser. No. 10/829,826 (incorporated herein by reference in its entirety) were to be employed prior to April 2003 to identify the SARS coronavirus in a clinical sample, both meanings of “unknown” bioagent are applicable since the SARS coronavirus was unknown to science prior to April 2003 and since it was not known what bioagent (in this case a coronavirus) was present in the sample. On the other hand, if the method of U.S. patent application Ser. No. 10/829,826 were to be employed subsequent to April 2003 to identify the SARS coronavirus in a clinical sample, only the first meaning (i) of “unknown” bioagent would apply since the SARS coronavirus became known to science subsequent to April 2003 and since it was not known what bioagent was present in the sample.

As used herein, “triangulation identification” means the employment of more than one bioagent identifying amplicon for identification of a bioagent.

In the context of the present invention, “viral nucleic acid” includes, but is not limited to, DNA, RNA, or DNA that has been obtained from viral RNA, such as, for example, by performing a reverse transcription reaction. Viral RNA can either be single-stranded (of positive or negative polarity) or double-stranded.

As used herein, the term “etiology” refers to the causes or origins of diseases or abnormal physiological conditions.

As used herein, the term “nucleobase” is synonymous with other terms in use in the art including “nucleotide,” “deoxynucleotide,” “nucleotide residue,” “deoxynucleotide residue,” “nucleotide triphosphate (NTP),” or “deoxynucleotide triphosphate (dNTP).”

Despite enormous biological diversity, all forms of life on earth share sets of essential, common features in their genomes. Since genetic data provide the underlying basis for identification of orthopoxvirus by the methods of the present invention, it is desirable to select segments of nucleic acids which ideally provide enough variability to distinguish each individual bioagent and whose molecular mass is amenable to molecular mass determination.

Unlike bacterial genomes, which exhibit conservation of numerous genes (i.e., housekeeping genes) across all organisms, viruses do not share a gene that is essential and conserved among all virus families. Therefore, viral identification is achieved within smaller groups of related viruses, such as members of a particular virus family or genus. For example, RNA-dependent RNA polymerase is present in all single-stranded RNA viruses and can be used for broad priming as well as resolution within the virus family.

Disclosed in U.S. Patent Application Publication Nos. 2003-0027135, 2003-0082539, 2003-0228571, 2004-0209260, 2004-0219517, and 2004-0180328, and in U.S. application Ser. Nos. 10/660,997, 10/728,486, 10/754,415, and 10/829,826, all of which are commonly owned and incorporated herein by reference in their entirety, are methods for identification of bioagents (any organism, cell, or virus, living or dead, or a nucleic acid derived from such an organism, cell or virus) in an unbiased manner by molecular mass and base composition analysis of “bioagent identifying amplicons” which are obtained by amplification of segments of essential and conserved genes which are involved in, for example, translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake, secretion and the like. Examples of the products of these genes include, but are not limited to, ribosomal RNAs, ribosomal proteins, DNA and RNA polymerases, RNA-dependent RNA polymerases, RNA capping and methylation enzymes, elongation factors, tRNA synthetases, protein chain initiation factors, heat shock protein groEL, phosphoglycerate kinase, NADH dehydrogenase, DNA ligases, DNA gyrases and DNA topoisomerases, helicases, metabolic enzymes, and the like.

To obtain bioagent identifying amplicons, primers are selected to hybridize to conserved sequence regions which bracket or flank variable sequence regions to yield a segment of nucleic acid which can be amplified and which is amenable to methods of molecular mass analysis. The variable sequence regions provide the variability of molecular mass which is used for bioagent identification. Upon amplification by PCR or other amplification methods with the specifically chosen primers, an amplification product that represents a bioagent identifying amplicon is obtained. The molecular mass of the amplification product, obtained by mass spectrometry for example, provides the means to uniquely identify the bioagent without a requirement for prior knowledge of the possible identity of the bioagent. The molecular mass of the amplification product or the corresponding base composition (which can be calculated from the molecular mass of the amplification product) is compared with a database of molecular masses or base compositions and a match indicates the identity of the bioagent. Furthermore, the method can be applied to rapid parallel analyses (for example, in a multi-well plate format), the results of which can be employed in a triangulation identification strategy which is amenable to rapid throughput and does not require nucleic acid sequencing of the amplified target sequence for bioagent identification.

The result of determination of a previously unknown base composition of a previously unknown bioagent (for example, a newly evolved and heretofore unobserved virus) has downstream utility by providing new bioagent indexing information with which to populate base composition databases. The process of subsequent bioagent identification analyses is, thus, greatly improved as more base composition data for bioagent identifying amplicons becomes available.

In some embodiments of the present invention, at least one viral nucleic acid segment is amplified in the process of identifying the viral bioagent. Thus, the nucleic acid segments that can be amplified by the primers disclosed herein and that provide enough variability to distinguish each individual bioagent and whose molecular masses are amenable to molecular mass determination are herein described as viral bioagent identifying amplicons.

In some embodiments of the present invention, viral bioagent identifying amplicons comprise from about 45 to about 200 nucleobases (i.e., from about 45 to about 200 linked nucleosides; or up to about 200 nucleobases). One of ordinary skill in the art will appreciate that the invention embodies viral bioagent identifying amplicons of 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, and 200 nucleobases in length, or any range therewithin.

It is the combination of the portions of the viral bioagent nucleic acid segment to which the primers hybridize (hybridization sites) and the variable region between the primer hybridization sites that comprises the viral bioagent identifying amplicon.

In some embodiments, viral bioagent identifying amplicons amenable to molecular mass determination which are produced by the primers described herein are either of a length, size or mass compatible with the particular mode of molecular mass determination or compatible with a means of providing a predictable fragmentation pattern in order to obtain predictable fragments of a length compatible with the particular mode of molecular mass determination. Such means of providing a predictable fragmentation pattern of an amplification product include, but are not limited to, cleavage with restriction enzymes or cleavage primers, for example. Thus, in some embodiments, viral bioagent identifying amplicons are larger than 200 nucleobases and are amenable to molecular mass determination following restriction digestion. Methods of using restriction enzymes and cleavage primers are well known to those with ordinary skill in the art.

In some embodiments, amplification products corresponding to viral bioagent identifying amplicons are obtained using the polymerase chain reaction (PCR) which is a routine method to those with ordinary skill in the molecular biology arts. Other amplification methods may be used such as ligase chain reaction (LCR), low-stringency single primer PCR, and multiple strand displacement amplification (MDA). These methods are also well known to those with ordinary skill.

Intelligent primers are designed to bind to highly conserved sequence regions that flank an intervening variable region and yield viral bioagent identifying amplicons upon amplification, which ideally provide enough variability to distinguish each individual viral bioagent, and which are amenable to molecular mass analysis. In some embodiments, the highly conserved sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity, or between about 99-100% identity. The molecular mass of a given amplification product provides a means of identifying the viral bioagent from which it was obtained, due to the variability of the variable region. Thus, design of intelligent primers requires selection of a variable region with appropriate variability to resolve the identity of a given bioagent. Viral bioagent identifying amplicons are ideally specific to the identity of the viral bioagent; however, this is not an absolute requirement because multiple viral bioagent identifying amplicons can be used in a triangulation strategy (vide infra).

Identification of viral bioagents can be accomplished at different taxonomic levels using intelligent primers suited to resolution of each individual level of identification. Broad range survey intelligent primers are designed with the objective of identifying a bioagent as a member of a particular division (e.g., an order, family, genus or other such grouping of viral bioagents above the species level). As a non-limiting example, members of the Orthopoxvirus genus may be identified as such by employing broad range survey intelligent primers such as primers which target RNA or DNA polymerases, helicases, or other viral genes. In some embodiments, broad range survey intelligent primers are capable of identification of bioagents at the species, sub-species or strain level.

Division-wide intelligent primers are designed with an objective of identifying a bioagent at the species level. Division-wide intelligent primers are not always required for identification at the species level because broad range survey intelligent primers may provide sufficient identification resolution to accomplish this identification objective.

Drill-down intelligent primers are designed with the objective of identifying a bioagent at the sub-species level (including strains, subtypes, variants and isolates) based on sub-species characteristics. Drill-down intelligent primers are not always required for identification at the sub-species level because broad range survey intelligent primers may provide sufficient identification resolution to accomplish this identification objective.

A representative process flow diagram for the primer selection and validation process is outlined in FIG. 1. For each group of organisms, candidate target sequences are identified (200) from which nucleotide alignments are created (210) and analyzed (220). Primers are then designed by selecting appropriate priming regions (230), which then enables the selection of candidate primer pairs (240). The primer pairs are then subjected to in silico analysis by electronic PCR (ePCR) (300) wherein bioagent identifying amplicons are obtained from sequence databases such as GenBank or other sequence collections (310) and checked for specificity in silico (320). Bioagent identifying amplicons obtained from GenBank sequences (310) can also be analyzed by a probability model which predicts the capability of a given amplicon to identify unknown bioagents such that the base compositions of amplicons with favorable probability scores are then stored in a base composition database (325). Alternatively, base compositions of the bioagent identifying amplicons obtained from the primers and GenBank sequences can be directly entered into the base composition database (330). Candidate primer pairs (240) are validated by in vitro amplification by a method such as PCR analysis (400) of nucleic acid from a collection of organisms (410). Amplification products thus obtained are analyzed to confirm the sensitivity, specificity and reproducibility of the primers used to obtain the amplification products (420).
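The in silico step (300 to 310) can be pictured with a deliberately simplified Python sketch. It assumes exact primer matches on a single strand, which real electronic PCR tools do not require, and the function names `reverse_complement` and `in_silico_amplicon` as well as the toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def in_silico_amplicon(template, forward, reverse):
    """Return the predicted amplicon, or None if either primer fails to match.

    Simplifying assumption: both primers must match the template exactly.
    The forward primer is searched on the given strand; the reverse primer
    is searched as its reverse complement downstream of the forward primer.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    # The amplicon includes both primer hybridization sites.
    return template[start:end + len(reverse_site)]


# Toy template: flank + forward site + variable region + reverse site + flank.
template = "TTTT" + "ACGTAC" + "GGGAAA" + "CCTGCA" + "TTTT"
reverse_primer = reverse_complement("CCTGCA")
print(in_silico_amplicon(template, "ACGTAC", reverse_primer))
```

The returned amplicon spans both hybridization sites and the intervening variable region, so its base composition could then be computed and stored in a base composition database.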

Many of the important pathogens, including the organisms of greatest concern as biological weapons agents, have been completely sequenced. This effort has greatly facilitated the design of primers and probes for the detection of individual bioagents. Thus, the combination of broad-range priming with division-wide and drill-down priming described herein is being used very successfully in several applications of the technology, including environmental surveillance for biowarfare threat agents and clinical sample analysis for medically important pathogens.

Synthesis of primers is well known and routine in the art. The primers may be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, Calif.). Any other means for such synthesis known in the art may additionally or alternatively be employed.

The primers are employed as, for example, compositions for use in methods for identification of viral bioagents as follows: a primer pair composition is contacted with nucleic acid (such as, for example, DNA from a DNA virus, or DNA reverse transcribed from the RNA of an RNA virus) of an unknown viral bioagent. The nucleic acid is then amplified by a nucleic acid amplification technique, such as PCR for example, to obtain an amplification product that represents a viral bioagent identifying amplicon. The molecular mass of each strand of the double-stranded amplification product is determined by a molecular mass measurement technique such as, for example, mass spectrometry wherein the two strands of the double-stranded amplification product are separated during the ionization process. In some embodiments, the mass spectrometry is electrospray Fourier transform ion cyclotron resonance mass spectrometry (ESI-FTICR-MS) or electrospray time of flight mass spectrometry (ESI-TOF-MS). A list of possible base compositions can be generated for the molecular mass value obtained for each strand and the choice of the correct base composition from the list is facilitated by matching the base composition of one strand with a complementary base composition of the other strand. The molecular mass or base composition thus determined is then compared with a database of molecular masses or base compositions of analogous bioagent identifying amplicons for known viral bioagents. A match between the molecular mass or base composition of the amplification product and the molecular mass or base composition of an analogous bioagent identifying amplicon for a known viral bioagent indicates the presence and/or identity of the unknown bioagent. In some embodiments, the primer pair used is one of the primer pairs of Table 1. In some embodiments, the method is repeated using a different primer pair to resolve possible ambiguities in the identification process or to improve the confidence level for the identification assignment.
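The step of listing possible base compositions for each strand mass and then keeping only complementary pairs can be sketched in Python as follows. This is a minimal illustration and not the procedure used in the examples: the residue masses are approximate average values, the end-group correction assumes strands with 5'-OH and 3'-OH termini, and all names (`RESIDUE_MASS`, `END_CORRECTION`, `candidate_compositions`, `complementary_candidates`) are hypothetical.

```python
# Approximate average masses (Da) of DNA nucleotide residues within a strand.
# Assumed illustrative values, not calibrated instrument constants.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
# Assumed end-group correction for a strand with 5'-OH and 3'-OH termini.
END_CORRECTION = -61.96


def strand_mass(comp):
    """Mass of a single strand with base composition comp (dict of counts)."""
    return sum(RESIDUE_MASS[b] * n for b, n in comp.items()) + END_CORRECTION


def candidate_compositions(mass, tol, max_len):
    """List base compositions of up to max_len residues within tol of mass."""
    target = mass - END_CORRECTION
    found = []
    for a in range(max_len + 1):
        for g in range(max_len + 1 - a):
            for c in range(max_len + 1 - a - g):
                rest = (target - a * RESIDUE_MASS["A"]
                        - g * RESIDUE_MASS["G"] - c * RESIDUE_MASS["C"])
                if rest < -tol:
                    break  # adding more C only overshoots further
                # Only the nearest T count is checked (tol is assumed small).
                t = round(rest / RESIDUE_MASS["T"])
                if t < 0 or a + g + c + t > max_len:
                    continue
                if abs(rest - t * RESIDUE_MASS["T"]) <= tol:
                    found.append({"A": a, "G": g, "C": c, "T": t})
    return found


def complement(comp):
    """Base composition of the complementary strand."""
    return {"A": comp["T"], "T": comp["A"], "G": comp["C"], "C": comp["G"]}


def complementary_candidates(fwd_mass, rev_mass, tol, max_len):
    """Keep forward-strand candidates whose complement fits the reverse mass."""
    reverse = candidate_compositions(rev_mass, tol, max_len)
    return [f for f in candidate_compositions(fwd_mass, tol, max_len)
            if complement(f) in reverse]


# Toy example: simulate the two strand masses of a known composition.
true_comp = {"A": 10, "G": 12, "C": 8, "T": 9}
hits = complementary_candidates(strand_mass(true_comp),
                                strand_mass(complement(true_comp)),
                                tol=0.5, max_len=45)
print(true_comp in hits, len(hits))
```

Requiring the two strands to be complementary removes candidates that fit one measured mass by coincidence, which is the reason both strands are measured.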

In some embodiments, a viral bioagent identifying amplicon may be produced using only a single primer (either the forward or reverse primer of any given primer pair), provided an appropriate amplification method is chosen, such as, for example, low stringency single primer PCR (LSSP-PCR). Adaptation of this amplification method in order to produce viral bioagent identifying amplicons can be accomplished by one with ordinary skill in the art without undue experimentation.

In some embodiments, the oligonucleotide primers are broad range survey primers which hybridize to conserved regions of nucleic acid encoding DNA polymerase, RNA polymerase, DNA helicase, RNA helicase, or thioredoxin-like gene of all (or between 80% and 100%, between 85% and 100%, between 90% and 100%, or between 95% and 100%) known orthopoxviruses and produce orthopoxvirus identifying amplicons. As used herein, the phrase “broad range survey primers” refers to primers that bind to nucleic acid encoding genes essential to orthopoxvirus replication (for example, DNA and RNA polymerases, DNA and RNA helicases and thioredoxin-like gene) of all (or between 80% and 100%, between 85% and 100%, between 90% and 100%, or between 95% and 100%) known species of orthopoxviruses. In some embodiments, the primer pairs comprise oligonucleotides ranging in length from 13 to 35 nucleobases, each of which has from 70% to 100% sequence identity with any of the primers shown in Table 1.

In some cases, the molecular mass or base composition of a viral bioagent identifying amplicon defined by a broad range survey primer pair does not provide enough resolution to unambiguously identify a viral bioagent at the species level. These cases benefit from further analysis of one or more viral bioagent identifying amplicons generated from at least one additional broad range survey primer pair or from at least one additional division-wide primer pair. The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as “triangulation identification.”
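Triangulation identification can be viewed as intersecting the sets of candidates that each primer pair leaves open. The Python sketch below is an illustration under stated assumptions: the function name `triangulate`, the database layout, the placeholder labels ("pair_1", "species_a") and the base compositions are all invented for the example and are not data from the tables of this document.

```python
def triangulate(observations, database):
    """Intersect the candidate bioagents suggested by each primer pair.

    observations: primer pair label -> observed base composition (A, G, C, T).
    database: primer pair label -> {bioagent label -> base composition}.
    Both layouts are hypothetical and used only for illustration.
    """
    candidates = None
    for primer_pair, composition in observations.items():
        matches = {agent for agent, known in database[primer_pair].items()
                   if known == composition}
        candidates = matches if candidates is None else candidates & matches
    return candidates or set()


# Invented toy database: two primer pairs, three placeholder bioagents.
toy_database = {
    "pair_1": {"species_a": (10, 12, 8, 9),
               "species_b": (10, 12, 8, 9),
               "species_c": (11, 11, 8, 9)},
    "pair_2": {"species_a": (20, 15, 14, 18),
               "species_b": (21, 14, 14, 18),
               "species_c": (20, 15, 14, 18)},
}
observed = {"pair_1": (10, 12, 8, 9), "pair_2": (20, 15, 14, 18)}
print(triangulate(observed, toy_database))
```

In this toy case neither primer pair alone is decisive (each matches two placeholder bioagents), but only one candidate is consistent with both amplicons.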

In other embodiments, the oligonucleotide primers are division-wide primers which hybridize to nucleic acid encoding genes of species within a genus of viruses. In other embodiments, the oligonucleotide primers are drill-down primers which enable the identification of sub-species characteristics. Drill-down primers provide the functionality of producing bioagent identifying amplicons for drill-down analyses such as genotyping or strain typing when contacted with nucleic acid under amplification conditions. Identification of such sub-species characteristics is often critical for determining proper clinical treatment of viral infections. In some embodiments, sub-species characteristics are identified using only broad range survey primers and division-wide primers, and drill-down primers are not used.

In some embodiments, the primers used for amplification hybridize to and amplify genomic DNA, DNA of bacterial plasmids, DNA of DNA viruses or DNA reverse transcribed from RNA of an RNA virus.

In some embodiments, the primers used for amplification hybridize directly to viral RNA and act as reverse transcription primers for obtaining DNA from direct amplification of viral RNA. Methods of amplifying RNA using reverse transcriptase are well known to those with ordinary skill in the art and can be routinely established without undue experimentation.

One with ordinary skill in the art of design of amplification primers will recognize that a given primer need not hybridize with 100% complementarity in order to effectively prime the synthesis of a complementary nucleic acid strand in an amplification reaction. Moreover, a primer may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (for example, a loop structure or a hairpin structure). The primers of the present invention may comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least 99% sequence identity with any of the primers listed in Table 1. Thus, in some embodiments of the present invention, an extent of variation of 70% to 100%, or any range therewithin, of the sequence identity is possible relative to the specific primer sequences disclosed herein. Determination of sequence identity is described in the following example: a primer 20 nucleobases in length which differs from another 20 nucleobase primer by only two residues has 18 of 20 identical residues (18/20=0.9 or 90% sequence identity). In another example, a primer 15 nucleobases in length having all residues identical to a 15 nucleobase segment of another primer that is 20 nucleobases in length would have 15/20=0.75 or 75% sequence identity with the 20 nucleobase primer. In yet another example, a first primer 35 nucleobases in length, having a 20 nucleobase segment which is identical to the entire sequence of a second primer 20 nucleobases in length, has 100% sequence identity with the second primer.
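The three worked examples above can be reproduced with a short script. This is a minimal sketch only: the function name `percent_identity` and the test sequences are hypothetical, and it assumes that identity is counted as the largest number of matching residues over all ungapped alignments, divided by the length of the reference primer. That is one reading of the examples and is not the algorithm of the Gap program.

```python
def percent_identity(query: str, reference: str) -> float:
    """Best ungapped match count between query and reference,
    expressed as a percentage of the reference length (assumed convention)."""
    best = 0
    # Slide the query across the reference at every possible ungapped offset.
    for offset in range(-(len(query) - 1), len(reference)):
        matches = 0
        for i, base in enumerate(query):
            j = offset + i
            if 0 <= j < len(reference) and reference[j] == base:
                matches += 1
        best = max(best, matches)
    return 100.0 * best / len(reference)


# Hypothetical 20-nucleobase reference primer (not taken from Table 1).
reference = "ATGCCGTAGGCTTAACGATC"

two_mismatches = "ATGACGTAGGGTTAACGATC"        # 20-mer differing at two residues
short_primer = reference[3:18]                  # 15-mer identical to a 15-residue segment
long_primer = "TTTTTTT" + reference + "GGGGGGGG"  # 35-mer containing the whole reference

for label, primer in [("20-mer, two mismatches", two_mismatches),
                      ("15-mer segment", short_primer),
                      ("35-mer containing reference", long_primer)]:
    print(label, percent_identity(primer, reference))
```

Dividing by the reference length is what makes the 15-nucleobase case come out at 75% and the 35-nucleobase case at 100%. A different denominator would give different figures.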

Percent homology, sequence identity or complementarity can be determined by, for example, the Gap program (Wisconsin Sequence Analysis Package, Version 8 for UNIX, Genetics Computer Group, University Research Park, Madison, Wis.), using default settings, which uses the algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489). In some embodiments, complementarity of primers with respect to the conserved priming regions of viral nucleic acid is between about 70% and 100%. In other embodiments, homology, sequence identity or complementarity is between about 80% and 100%. In yet other embodiments, homology, sequence identity or complementarity is at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or is 100%.

In some embodiments, the primers described herein comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 98%, or at least 99%, or 100% (or any range therewithin) sequence identity with the primer sequences specifically disclosed herein. Thus, for example, a primer may have between 70% and 100%, between 75% and 100%, between 80% and 100%, or between 95% and 100% sequence identity with SEQ ID NO: 1. Likewise, a primer may have similar sequence identity with any other primer whose nucleotide sequence is disclosed in Table 1.

One with ordinary skill is able to calculate percent sequence identity or percent sequence homology and able to determine, without undue experimentation, the effects of variation of primer sequence identity on the function of the primer in its role in priming synthesis of a complementary strand of nucleic acid for production of an amplification product of a corresponding viral bioagent identifying amplicon.

In some embodiments of the present invention, the oligonucleotide primers are 13 to 35 nucleobases in length (13 to 35 linked nucleotide residues; or up to 35 nucleotide residues). These embodiments comprise oligonucleotide primers 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 nucleobases in length, or any range therewithin.

In some embodiments, any given primer can comprise a modification comprising the addition of a non-templated T residue to the 5′ end of the primer (i.e., the added T residue does not necessarily hybridize to the nucleic acid being amplified). The addition of a non-templated T residue has an effect of minimizing the addition of non-templated adenyl residues as a result of the non-specific enzyme activity of Taq polymerase (Magnuson et al., Biotechniques, 1996, 21, 700-709), an occurrence which may lead to ambiguous results arising from molecular mass analysis.

In some embodiments of the present invention, primers may contain one or more universal bases. Because any variation in the conserved regions among species is likely to occur in the third position of a DNA (or RNA) triplet (due to codon wobble), oligonucleotide primers can be designed such that the nucleotide corresponding to this position is a base which can bind to more than one nucleotide, referred to herein as a “universal nucleobase.” For example, under this “wobble” pairing, inosine (I) binds to U, C or A; guanine (G) binds to U or C; and uridine (U) binds to A or G. Other examples of universal nucleobases include, but are not limited to, nitroindoles such as 5-nitroindole or 3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides, 1995, 14, 1001-1003), the degenerate nucleotides dP or dK (Hill et al., Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 4258-4263), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056) or the purine analog 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide (Sala et al., Nucl. Acids Res., 1996, 24, 3302-3306).

In some embodiments, to compensate for the somewhat weaker binding by the wobble base, the oligonucleotide primers are designed such that the first and second positions of each triplet are occupied by nucleotide analogs which bind with greater affinity than the unmodified nucleotide. Examples of these analogs include, but are not limited to, 2,6-diaminopurine, which binds to thymine; 5-propynyluracil, which binds to adenine; and 5-propynylcytosine and phenoxazines, including G-clamp, which bind to G. Propynylated pyrimidines are described in U.S. Pat. Nos. 5,645,985, 5,830,653 and 5,484,908, each of which is commonly owned and incorporated herein by reference in its entirety. Propynylated primers are described in U.S. Patent Application Publication No. 2003-0170682, which is also commonly owned and incorporated herein by reference in its entirety. Phenoxazines are described in U.S. Pat. Nos. 5,502,177, 5,763,588, and 6,005,096, each of which is incorporated herein by reference in its entirety. G-clamps are described in U.S. Pat. Nos. 6,007,992 and 6,028,183, each of which is incorporated herein by reference in its entirety.

In some embodiments, to enable broad priming of rapidly evolving RNA viruses, primer hybridization is enhanced using primers containing 5-propynyl deoxycytidine and deoxythymidine nucleotides. These modified primers offer increased affinity and base pairing selectivity.

In some embodiments, non-template primer tags are used to increase the melting temperature (Tm) of a primer-template duplex in order to improve amplification efficiency. A non-template tag is at least three consecutive A or T nucleotide residues on a primer which are not complementary to the template. In any given non-template tag, A can be replaced by C or G and T can also be replaced by C or G. Although Watson-Crick hybridization is not expected to occur for a non-template tag relative to the template, the extra hydrogen bond in a G-C pair relative to an A-T pair confers increased stability of the primer-template duplex and improves amplification efficiency for subsequent cycles of amplification when the primers hybridize to strands synthesized in previous cycles.

In other embodiments, propynylated tags may be used in a manner similar to that of the non-template tag, wherein two or more 5-propynylcytidine or 5-propynyluridine residues replace template matching residues on a primer. In other embodiments, a primer contains a modified internucleoside linkage such as a phosphorothioate linkage, for example.

In some embodiments, the primers contain mass-modifying tags. Reducing the total number of possible base compositions of a nucleic acid of specific molecular weight provides a means of avoiding a persistent source of ambiguity in determination of base composition of amplification products. Addition of mass-modifying tags to certain nucleobases of a given primer will result in simplification of de novo determination of base composition of a given bioagent identifying amplicon from its molecular mass.

In some embodiments of the present invention, the mass-modified nucleobase comprises, for example, one or more of the following: 7-deaza-2′-deoxyadenosine-5′-triphosphate, 5-iodo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxycytidine-5′-triphosphate, 5-iodo-2′-deoxycytidine-5′-triphosphate, 5-hydroxy-2′-deoxyuridine-5′-triphosphate, 4-thiothymidine-5′-triphosphate, 5-aza-2′-deoxyuridine-5′-triphosphate, 5-fluoro-2′-deoxyuridine-5′-triphosphate, O6-methyl-2′-deoxyguanosine-5′-triphosphate, N2-methyl-2′-deoxyguanosine-5′-triphosphate, 8-oxo-2′-deoxyguanosine-5′-triphosphate, or thiothymidine-5′-triphosphate. In some embodiments, the mass-modified nucleobase comprises ¹⁵N or ¹³C or both ¹⁵N and ¹³C.

In some cases, a molecular mass of a given bioagent identifying amplicon alone does not provide enough resolution to unambiguously identify a given bioagent. The employment of more than one viral bioagent identifying amplicon for identification of a bioagent is herein referred to as triangulation identification. Triangulation identification is pursued by analyzing a plurality of bioagent identifying amplicons selected within multiple genes. This process is used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al., J. Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected signatures from a representative orthopoxvirus genome would suggest a genetic engineering event.

In some embodiments, the triangulation identification process can be pursued by characterization of bioagent identifying amplicons in a massively parallel fashion using the polymerase chain reaction (PCR), such as multiplex PCR where multiple primers are employed in the same amplification reaction mixture, or PCR in multi-well plate format wherein a different and unique pair of primers is used in multiple wells containing otherwise identical reaction mixtures. Such multiplex and multi-well PCR methods are well known to those with ordinary skill in the arts of rapid throughput amplification of nucleic acids.

In some embodiments, the molecular mass of a given viral bioagent identifying amplicon is determined by mass spectrometry. Mass spectrometry has several advantages, not the least of which is high bandwidth characterized by the ability to separate (and isolate) many molecular peaks across a broad range of mass to charge ratio (m/z). Thus mass spectrometry is intrinsically a parallel detection scheme without the need for radioactive or fluorescent labels or probes, since every amplification product is identified by its molecular mass. The current state of the art in mass spectrometry is such that less than femtomole quantities of material can be readily analyzed to afford information about the molecular contents of the sample. An accurate assessment of the molecular mass of the material can be quickly obtained, irrespective of whether the molecular weight of the sample is several hundred, or in excess of one hundred thousand atomic mass units (amu) or Daltons.

In some embodiments, intact molecular ions are generated from amplification products using one of a variety of ionization techniques to convert the sample to gas phase. These ionization methods include, but are not limited to, electrospray ionization (ESI), matrix-assisted laser desorption ionization (MALDI) and fast atom bombardment (FAB). Upon ionization, several peaks are observed from one sample due to the formation of ions with different charges. Averaging the multiple readings of molecular mass obtained from a single mass spectrum affords an estimate of molecular mass of the bioagent identifying amplicon. Electrospray ionization mass spectrometry (ESI-MS) is particularly useful for very high molecular weight polymers such as proteins and nucleic acids having molecular weights greater than 10 kDa, since it yields a distribution of multiply-charged molecules of the sample without causing a significant amount of fragmentation.
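The averaging of readings from several charge states can be illustrated with a short calculation. The following is a sketch under stated assumptions, not a description of any instrument software: it assumes negative-ion mode, in which each peak corresponds to a molecule that has lost z protons, it assumes the charge of each peak has already been assigned, and it ignores salt adducts. The function names and the synthetic peak list are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass implied by one peak of an [M - zH]^(z-) ion (assumed ion type)."""
    return charge * (mz + PROTON_MASS)


def estimate_mass(peaks):
    """Average the neutral-mass estimates from a list of (m/z, charge) pairs."""
    estimates = [neutral_mass(mz, z) for mz, z in peaks]
    return sum(estimates) / len(estimates)


# Synthetic spectrum for a hypothetical 30,000 Da strand observed at three charge states.
true_mass = 30000.0
peaks = [((true_mass - z * PROTON_MASS) / z, z) for z in (20, 25, 30)]

print(round(estimate_mass(peaks), 3))
```

With real data each charge state carries its own measurement error, so the average is generally a better estimate than any single peak.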

The mass detectors used in the methods of the present invention include, but are not limited to, Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), time of flight (TOF), ion trap, quadrupole, magnetic sector, Q-TOF, and triple quadrupole.

Although the molecular mass of amplification products obtained using intelligent primers provides a means for identification of bioagents, conversion of molecular mass data to a base composition signature is useful for certain analyses. As used herein, a base composition signature (BCS) is the exact base composition determined from the molecular mass of a bioagent identifying amplicon. In one embodiment, a BCS provides an index of a specific gene in a specific organism. As used herein, a base composition is the exact number of each nucleobase (A, T, C and G).
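Conversion of a measured mass into candidate base compositions can be sketched as a brute-force search. The code below is illustrative only and makes several assumptions that are not taken from this document: it uses approximate average residue masses for unmodified DNA, a commonly used end correction for a single strand with hydroxyl termini, and a tolerance much smaller than the mass of one residue. The function names are hypothetical.

```python
# Approximate average masses (Da) of nucleotide residues within a DNA strand (assumed values).
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
END_CORRECTION = -61.96  # assumed correction for a strand with 5'-OH and 3'-OH ends


def strand_mass(a: int, c: int, g: int, t: int) -> float:
    """Approximate average mass of a single DNA strand with the given base counts."""
    return (a * RESIDUE_MASS["A"] + c * RESIDUE_MASS["C"]
            + g * RESIDUE_MASS["G"] + t * RESIDUE_MASS["T"] + END_CORRECTION)


def candidate_compositions(measured_mass: float, max_length: int, tolerance_da: float):
    """List every (A, C, G, T) count whose mass lies within tolerance_da of measured_mass."""
    hits = []
    for a in range(max_length + 1):
        for c in range(max_length + 1 - a):
            for g in range(max_length + 1 - a - c):
                # Solve for the T count that best explains the remaining mass.
                rest = measured_mass - strand_mass(a, c, g, 0)
                t = round(rest / RESIDUE_MASS["T"])
                if t < 0 or a + c + g + t > max_length:
                    continue
                if abs(strand_mass(a, c, g, t) - measured_mass) <= tolerance_da:
                    hits.append((a, c, g, t))
    return hits


# Hypothetical 50-residue strand: 15 A, 10 C, 12 G, 13 T.
measured = strand_mass(15, 10, 12, 13)
tight = candidate_compositions(measured, max_length=60, tolerance_da=0.05)
loose = candidate_compositions(measured, max_length=60, tolerance_da=1.0)
print(len(tight), len(loose))
```

Widening the tolerance can only add candidates, which is why mass accuracy, measurement of both strands, or mass-modifying tags help narrow the list to a single composition.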

RNA viruses depend on error-prone polymerases for replication and therefore their nucleotide sequences (and resultant base compositions) drift over time within the functional constraints allowed by selection pressure. Base composition probability distribution of a viral species or group represents a probabilistic distribution of the above variation in the A, C, G and T base composition space and can be derived by analyzing base compositions of, for example, all known isolates of that particular species.

In some embodiments, assignment of the likelihood that a previously unknown or un-indexed base composition corresponds to a particular virus, or a related member of a group of viruses, is accomplished using base composition probability clouds or base composition density polyhedrons. Base compositions, like sequences, vary slightly from isolate to isolate within species or individual genotypes. It is possible to manage this diversity by building base composition probability clouds around the composition constraints for each species. This permits identification of organisms in a fashion similar to sequence analysis. A pseudo four-dimensional plot can be used to visualize the concept of base composition probability clouds. Likewise, a system of tetrahedral axes can be used to build a polyhedron according to seven base composition constraints. Optimal primer design requires optimal choice of bioagent identifying amplicons and maximizes the separation between the base composition signatures of individual bioagents. Areas where clouds overlap indicate regions that may result in a misclassification, a problem which is overcome by a triangulation identification process using bioagent identifying amplicons not affected by overlap of base composition probability clouds or density polyhedrons.

In some embodiments, pre-calculated base composition probability clouds provide the means for screening potential primer pairs in order to avoid potential misclassifications of base compositions. In other embodiments, base composition probability clouds provide the means for predicting the identity of a bioagent whose assigned base composition was not previously observed and/or indexed in a bioagent identifying amplicon base composition database due to evolutionary transitions in its nucleic acid sequence. Thus, in contrast to probe-based techniques, mass spectrometry determination of base composition does not require prior knowledge of the composition or sequence in order to make the measurement. Methods of calculating base composition probability clouds are described in U.S. Patent Application Publication No. 2004-0209260. Likewise, methods of calculating base composition density polyhedrons are described in U.S. patent application Ser. No. 11/073,362.

The present invention provides bioagent classifying information similar to DNA sequencing and phylogenetic analysis at a level sufficient to identify a given bioagent. Furthermore, the process of determination of a previously unknown base composition for a given bioagent (for example, in a case where sequence information is unavailable) has downstream utility by providing additional bioagent indexing information with which to populate base composition databases. The process of future bioagent identification is, thus, greatly improved as more base compositions become available in base composition databases.

In some embodiments, the identity and quantity of an unknown bioagent can be determined using a representative process illustrated in FIG. 2. Primers (500) and a known quantity of a calibration polynucleotide (505) are added to a sample containing nucleic acid of an unknown bioagent (508). The total nucleic acid in the sample is then subjected to an amplification reaction to obtain amplification products (510). The molecular masses of the amplification products are determined, from which molecular mass and abundance data are obtained (515). The molecular mass of the bioagent identifying amplicon (520) provides the means for its identification (525) and the molecular mass of the calibration amplicon obtained from the calibration polynucleotide (530) provides the means for its identification (535). The abundance data of the bioagent identifying amplicon (540) is recorded and the abundance data for the calibration amplicon (545) is recorded, both of which are used in a calculation which determines the quantity of unknown bioagent in the sample (550).

For concurrent identification and quantitation of an unknown bioagent, a sample comprising the unknown bioagent is contacted with a pair of primers which provide the means for amplification of nucleic acid from the bioagent, and a known quantity of a polynucleotide that comprises a calibration sequence. The nucleic acids of the bioagent and of the calibration sequence are amplified and the rate of amplification is reasonably assumed to be similar for the nucleic acid of the bioagent and of the calibration sequence. The amplification reaction then produces two amplification products: a bioagent identifying amplicon and a calibration amplicon. The bioagent identifying amplicon and the calibration amplicon should be distinguishable by molecular mass while being amplified at essentially the same rate. Effecting differential molecular masses can be accomplished by choosing as a calibration sequence a representative bioagent identifying amplicon (from a specific species of bioagent) and performing, for example, a 2-8 nucleobase deletion or insertion within the variable region between the two priming sites. The amplified sample containing the bioagent identifying amplicon and the calibration amplicon is then subjected to molecular mass analysis by, for example, mass spectrometry. The resulting molecular mass analysis of the nucleic acid of the bioagent and of the calibration sequence provides molecular mass data and abundance data for the nucleic acid of the bioagent and of the calibration sequence. The molecular mass data obtained for the nucleic acid of the bioagent enables identification of the unknown bioagent and the abundance data enables calculation of the quantity of the bioagent, based on the knowledge of the quantity of calibration polynucleotide contacted with the sample.
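The quantity calculation described above reduces to a ratio when the two amplicons are assumed to amplify at the same rate. The sketch below shows that arithmetic only. The function name, the abundance units and the example numbers are hypothetical, and a simple proportional relationship between abundance and starting quantity is an assumption rather than a calibrated model.

```python
def bioagent_quantity(bioagent_abundance: float,
                      calibrant_abundance: float,
                      calibrant_copies: float) -> float:
    """Estimate starting copies of bioagent nucleic acid from amplicon abundances.

    Assumes both amplicons amplify at essentially the same rate, so the ratio of
    their measured abundances equals the ratio of their starting quantities.
    """
    if calibrant_abundance <= 0:
        # No calibration amplicon suggests amplification or analysis failed.
        raise ValueError("no measurable calibration amplicon")
    return calibrant_copies * bioagent_abundance / calibrant_abundance


# Hypothetical run: 100 copies of calibration polynucleotide were added to the sample,
# and the bioagent amplicon peak is twice as abundant as the calibration amplicon peak.
estimate = bioagent_quantity(bioagent_abundance=4.0e5,
                             calibrant_abundance=2.0e5,
                             calibrant_copies=100)
print(estimate)
```

Raising an error when the calibration amplicon is absent mirrors the use of the calibrant as an internal positive control.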

In some embodiments, construction of a standard curve where the amount of calibration polynucleotide spiked into the sample is varied provides additional resolution and improved confidence for the determination of the quantity of bioagent in the sample. The use of standard curves for analytical determination of molecular quantities is well known to one with ordinary skill and can be performed without undue experimentation.

In some embodiments, multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with multiple primer pairs which also amplify the corresponding standard calibration sequences. In this or other embodiments, the standard calibration sequences are optionally included within a single vector which functions as the calibration polynucleotide. Multiplex amplification methods are well known to those with ordinary skill and can be performed without undue experimentation. However, for the purpose of measurement of bioagent identifying amplicons by mass spectrometry, it is advantageous to ensure that no single strand of a double stranded bioagent identifying amplicon has a molecular mass substantially similar to another single strand present in the multiplex amplification mixture to avoid the presence of overlapping mass peaks in the resulting mass spectrum.
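A check for overlapping strand masses in a planned multiplex mixture can be sketched as a pairwise comparison. This is a hypothetical helper, not part of the described method: the strand labels, the masses and the minimum separation are invented for illustration, and a suitable separation would depend on the resolution of the mass spectrometer used.

```python
from itertools import combinations


def overlapping_strands(strand_masses: dict, min_separation_da: float):
    """Return pairs of strand labels whose masses differ by less than min_separation_da."""
    clashes = []
    for (name_a, mass_a), (name_b, mass_b) in combinations(sorted(strand_masses.items()), 2):
        if abs(mass_a - mass_b) < min_separation_da:
            clashes.append((name_a, name_b))
    return clashes


# Hypothetical single-strand masses (Da) expected in one multiplex reaction.
expected = {
    "pair1_forward_strand": 30512.4,
    "pair1_reverse_strand": 30871.9,
    "pair2_forward_strand": 30514.1,   # within 2 Da of pair1_forward_strand
    "pair2_reverse_strand": 31250.6,
}

print(overlapping_strands(expected, min_separation_da=5.0))
```

Any pair reported by such a check would suggest moving one primer pair to a separate reaction or redesigning an amplicon or calibrant.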

In some embodiments, the calibrant polynucleotide is used as an internal positive control to confirm that amplification conditions and subsequent analysis steps are successful in producing a measurable amplicon. Even in the absence of copies of the genome of a bioagent, the calibration polynucleotide should give rise to a calibration amplicon. Failure to produce a measurable calibration amplicon indicates a failure of amplification or of a subsequent analysis step such as amplicon purification or molecular mass determination. Reaching a conclusion that such failures have occurred is, in itself, a useful event.

In some embodiments, the calibration sequence is comprised of DNA. In some embodiments, the calibration sequence is comprised of RNA.

In some embodiments, the calibration sequence is inserted into a vector which then itself functions as the calibration polynucleotide. In some embodiments, more than one calibration sequence is inserted into the vector that functions as the calibration polynucleotide. Such a calibration polynucleotide is herein termed a “combination calibration polynucleotide.” The process of inserting polynucleotides into vectors is routine to those skilled in the art and can be accomplished without undue experimentation. Thus, it should be recognized that the calibration method should not be limited to the embodiments described herein. The calibration method can be applied for determination of the quantity of any bioagent identifying amplicon when an appropriate standard calibrant polynucleotide sequence is designed and used. The process of choosing an appropriate vector for insertion of a calibrant is also a routine operation that can be accomplished by one with ordinary skill without undue experimentation.

Bioagents that can be identified by the methods of the present invention include RNA viruses. The genomes of RNA viruses can be positive-sense single-stranded RNA, negative-sense single-stranded RNA or double-stranded RNA. Examples of RNA viruses with positive-sense single-stranded genomes include, but are not limited to, members of the Caliciviridae, Picornaviridae, Flaviviridae, Togaviridae, Retroviridae and Coronaviridae families. Examples of RNA viruses with negative-sense single-stranded RNA genomes include, but are not limited to, members of the Filoviridae, Rhabdoviridae, Bunyaviridae, Orthomyxoviridae, Paramyxoviridae and Arenaviridae families. Examples of RNA viruses with double-stranded RNA genomes include, but are not limited to, members of the Reoviridae and Birnaviridae families.

In some embodiments of the present invention, RNA viruses are identified by first obtaining RNA from an RNA virus, or a sample containing or suspected of containing an RNA virus, obtaining corresponding DNA from the RNA by reverse transcription, amplifying the DNA to obtain one or more amplification products using one or more pairs of oligonucleotide primers that bind to conserved regions of the RNA viral genome, which flank a variable region of the genome, determining the molecular mass or base composition of the one or more amplification products and comparing the molecular masses or base compositions with calculated or experimentally determined molecular masses or base compositions of known RNA viruses, wherein at least one match identifies the RNA virus. Methods of isolating RNA from RNA viruses and/or samples containing RNA viruses, and reverse transcribing RNA to DNA are well known to those of skill in the art.

Orthopoxviruses represent DNA virus examples of viral bioagents which can be identified by the methods of the present invention. Orthopoxviruses are extremely diverse at the nucleotide and protein sequence levels and are thus difficult to detect and identify using currently available diagnostic techniques.

In some embodiments of the present invention, the orthopoxvirus target gene is DNA polymerase, RNA polymerase, DNA helicase, RNA helicase, or thioredoxin-like gene.

In other embodiments of the present invention, the intelligent primers produce bioagent identifying amplicons within stable and highly conserved regions of orthopoxvirus genomes. The advantage of characterizing an amplicon in a highly conserved region is that there is a low probability that the region will evolve past the point of primer recognition, in which case the amplification step would fail. Such a primer set is, thus, useful as, for example, a broad range survey-type primer. In another embodiment of the present invention, the intelligent primers produce bioagent identifying amplicons in a region which evolves more quickly than the stable region described above. The advantage of characterizing a bioagent identifying amplicon corresponding to an evolving genomic region is that it is useful for distinguishing emerging strain variants.

The present invention also has significant advantages as a platform for identification of diseases caused by emerging viruses. The present invention eliminates the need for prior knowledge of bioagent sequence to generate hybridization probes. Thus, in another embodiment, the present invention provides a means of determining the etiology of a virus infection when the process of identification of viruses is carried out in a clinical setting, even when the virus is a new species never observed before. This is possible because the methods are not confounded by naturally occurring evolutionary variations (a major concern for characterization of viruses which evolve rapidly) occurring in the sequence acting as the template for production of the bioagent identifying amplicon. Measurement of molecular mass and determination of base composition is accomplished in an unbiased manner without sequence prejudice.

Another embodiment of the present invention also provides a means of tracking the spread of any species or strain of virus when a plurality of samples obtained from different locations are analyzed by the methods described above in an epidemiological setting. In one embodiment, a plurality of samples from a plurality of different locations is analyzed with primers which produce viral bioagent identifying amplicons, a subset of which contains a specific virus. The corresponding locations of the members of the virus-containing subset indicate the spread of the specific virus to the corresponding locations.

The present invention also provides kits for carrying out the methods described herein. In some embodiments, the kit may comprise a sufficient quantity of one or more primer pairs to perform an amplification reaction on a target polynucleotide from a bioagent to form a bioagent identifying amplicon. In some embodiments, the kit may comprise from one to fifty primer pairs, from one to twenty primer pairs, from one to ten primer pairs, or from two to five primer pairs. In some embodiments, the kit may comprise one or more, two or more, three or more, or four or more primer pairs, wherein each member of the pair is of a length of 13 to 35 nucleobases and has 70% to 100% sequence identity with any of the primers recited in Table 1.

In some embodiments, the kit may comprise one or more broad range survey primer(s), division-wide primer(s), or drill-down primer(s), or any combination thereof. A kit may be designed so as to comprise particular primer pairs for identification of a particular bioagent. For example, a broad range survey primer kit may be used initially to identify an unknown bioagent as a member of the Orthopoxvirus genus. As another example, a division-wide kit may be used to distinguish Bangladesh 1975, India-1967 and Garcia-1966 strains of variola virus from each other. A drill-down kit may be used, for example, to distinguish different subtypes or genotypes of orthopoxviruses. In some embodiments, any of these kits may be combined to comprise a combination of broad range survey primers and division-wide primers so as to be able to identify the species of an unknown bioagent.

In some embodiments, the kit may contain standardized calibration polynucleotides for use as internal amplification calibrants. Internal calibrants are described in commonly owned U.S. Patent Application Ser. No. 60/545,425, which is incorporated herein by reference in its entirety.

In some embodiments, the kit may also comprise a sufficient quantity of reverse transcriptase (if an RNA virus is to be identified, for example), a DNA polymerase, suitable nucleoside triphosphates (including any of those described above), a DNA ligase, and/or reaction buffer, or any combination thereof, for the amplification processes described above. A kit may further include instructions pertinent for the particular embodiment of the kit, such instructions describing the primer pairs and amplification conditions for operation of the method. A kit may also comprise amplification reaction containers such as microcentrifuge tubes and the like. A kit may also comprise reagents or other materials for isolating bioagent nucleic acid or bioagent identifying amplicons from amplification, including, for example, detergents, solvents, or ion exchange resins which may be linked to magnetic beads. A kit may also comprise a container such as a 96-well plate. A kit may also comprise a table of measured or calculated molecular masses and/or base compositions of bioagents using the primer pairs of the kit.

While the present invention has been described with specificity in accordance with certain of its embodiments, the following examples serve only to illustrate the invention and are not intended to limit the same. In order that the invention disclosed herein may be more efficiently understood, examples are provided below. It should be understood that these examples are for illustrative purposes only and are not to be construed as limiting the invention in any manner.

EXAMPLES

Example 1: Orthopoxvirus Identifying Amplicons

For design of primers that define orthopoxvirus identifying amplicons, all available sequences for members of the Orthopoxvirus genus were obtained from GenBank and the Poxvirus database (world wide web at poxvirus.org) and aligned and scanned for regions where pairs of PCR primers would amplify products between about 45 and about 200 nucleotides in length and distinguish species and/or sub-species from each other by their molecular masses or base compositions. A typical process shown in FIG. 1 is employed.

A database of expected base compositions for each primer region isgenerated using an in silico PCR search algorithm, such as (ePCR). Anexisting RNA structure search algorithm (Macke et al., Nucl. Acids Res.,2001, 29, 4724-4735, which is incorporated herein by reference in itsentirety) has been modified to include PCR parameters such ashybridization conditions, mismatches, and thermodynamic calculations(SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, whichis incorporated herein by reference in its entirety). This also providesinformation on primer specificity of the selected primer pairs.
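The idea of an in silico PCR search can be illustrated with a minimal sketch. It is not the ePCR program or the modified search algorithm cited above: it assumes exact primer matches only, ignores hybridization conditions and thermodynamics, and uses hypothetical function names and a made-up template sequence.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def in_silico_amplicon(template: str, forward: str, reverse: str):
    """Return the predicted amplicon (primers included) or None.

    Assumption: both primers must match the template exactly; a real
    search would tolerate mismatches and score duplex stability.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is what appears on the template strand.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(site)]


def base_composition(seq: str) -> dict:
    """Count A, G, C and T in a sequence."""
    return {b: seq.upper().count(b) for b in "AGCT"}


# Hypothetical template and primers, for illustration only.
template = "AAGACCAGGATTTACCTAAGCATCAGGTT"
amplicon = in_silico_amplicon(template, "GACCAGGA", "CCTGATGC")
print(amplicon, base_composition(amplicon))
```

Running such a search over every database sequence and keeping only products in the stated length window (about 45 to about 200 nucleotides) would give one expected base composition per sequence and primer pair.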

Table 1 represents a collection of primers (sorted by forward primername) designed to identify orthopoxviruses using the methods describedherein. Primer sites were identified on five essential genes: DNApolymerase (E9L), RNA polymerase (A24R) DNA helicase (A18R), RNAhelicase (K8R) and thioredoxin-like gene (A25L). The forward or reverseprimer name shown in Table 1 indicates the gene region of the viralgenome to which the primer hybridizes relative to a reference sequence.For example, the forward primer name K8R_NC001611_(—)221_(—)238_Findicates a forward primer “_F” that hybridizes to residues 221-238 ofan orthopoxvirus reference sequence represented by GenBank Accession No.NC001611. In Table 1, T^(a)=5-propynyluracil (a propynylated version ofT); and C^(a)=5-propynylcytosine (a propynylated version of C). Theprimer pair number is an in-house database index number.

TABLE 1 Primer Pairs for Identification of Orthopoxvirus Bioagents

| Primer Pair Number | Forward Primer Name | Forward Sequence | For SEQ ID NO: | Reverse Primer Name | Reverse Sequence | Rev SEQ ID NO: |
|---|---|---|---|---|---|---|
| 296 | A18R_NC001611_100_117P_F | GAAGT^(a)T^(a)GAAC^(a)C^(a)GGGATCA | 1 | A18R_NC001611_187_207P_R | ATTATCGGT^(a)C^(a)GT^(a)T^(a)GT^(a)AATGT | 24 |
| 297 | A18R_NC001611_1348_1370P_F | CTGT^(a)C^(a)T^(a)GTAGATAAAC^(a)T^(a)AGGATT | 2 | A18R_NC001611_1428_1445P_R | CGTTC^(a)T^(a)T^(a)C^(a)T^(a)C^(a)T^(a)GGAGGAT | 25 |
| 298 | K8R_NC001611_221_238P_F | CT^(a)C^(a)C^(a)TC^(a)C^(a)ATCAC^(a)T^(a)AGGAA | 3 | K8R_NC001611_290_311P_R | CTATAACAT^(a)T^(a)C^(a)AAAGC^(a)T^(a)T^(a)ATTG | 26 |
| 299 | E9L_NC001611_1119_1133P_F | CGATAC^(a)T^(a)AC^(a)GGACGC | 4 | E9L_NC001611_1201_1222P_R | CTTTATGAAT^(a)T^(a)AC^(a)T^(a)T^(a)T^(a)ACATAT | 27 |
| 300 | A25L_NC001611_28_45P_F | GTAC^(a)T^(a)GAAT^(a)C^(a)GC^(a)C^(a)TAAG | 5 | A25L_NC001611_105_127P_R | GTGAATAAAGTAT^(a)C^(a)GC^(a)C^(a)C^(a)T^(a)AATA | 28 |
| 301 | A24R_NC001611_795_817P_F | CGCGAT^(a)AAT^(a)AGATAGT^(a)GC^(a)T^(a)AAAC | 6 | A24R_NC001611_860_878P_R | GCTTC^(a)C^(a)AC^(a)CAGGT^(a)CAT^(a)TAA | 29 |
| 308 | A18R_NC001611_100_117_F | GAAGTTGAACCGGGATCA | 1 | A18R_NC001611_187_207_R | ATTATCGGTCGTTGTTAATGT | 24 |
| 309 | A18R_NC001611_1348_1370_F | CTGTCTGTAGATAAACTAGGATT | 2 | A18R_NC001611_1428_1445_R | CGTTCTTCTCTGGAGGAT | 25 |
| 310 | K8R_NC001611_221_238_F | CTCCTCCATCACTAGGAA | 3 | K8R_NC001611_290_311_R | CTATAACATTCAAAGCTTATTG | 26 |
| 311 | E9L_NC001611_1119_1133_F | CGATACTACGGACGC | 4 | E9L_NC001611_1201_1222_R | CTTTATGAATTACTTTACATAT | 27 |
| 312 | A25L_NC001611_28_45_F | GTACTGAATCCGCCTAAG | 5 | A25L_NC001611_105_127_R | GTGAATAAAGTATCGCCCTAATA | 28 |
| 313 | A24R_NC001611_795_817_F | CGCGATAATAGATAGTGCTAAAC | 6 | A24R_NC001611_860_878_R | GCTTCCACCAGGTCATTAA | 29 |
| 488 | A18R_NC001611_98_117P_F | TAGAAGT^(a)T^(a)GAAC^(a)C^(a)GGGATCA | 7 | A18R_NC001611_187_208P_R | TATTATCGGT^(a)C^(a)GT^(a)T^(a)GT^(a)T^(a)AATGT | 30 |
| 489 | A18R_NC001611_1347_1370P_F | TCTGT^(a)C^(a)T^(a)GTAGATAAAC^(a)T^(a)AGGATT | 8 | A18R_NC001611_1428_1446P_R | TCGTTC^(a)T^(a)T^(a)C^(a)T^(a)C^(a)T^(a)GGAGGAT | 31 |
| 490 | K8R_NC001611_220_238P_F | TCT^(a)C^(a)C^(a)TC^(a)C^(a)ATCAC^(a)T^(a)AGGAA | 9 | K8R_NC001611_290_312P_R | TCTATAACAT^(a)T^(a)C^(a)AAAGC^(a)T^(a)T^(a)ATTG | 32 |
| 491 | E9L_NC001611_1118_1133P_F | TCGATAC^(a)T^(a)AC^(a)GGACGC | 10 | E9L_NC001611_1201_1223P_R | TCTTTATGAAT^(a)T^(a)AC^(a)T^(a)T^(a)T^(a)ACATAT | 33 |
| 492 | A25L_NC001611_27_45P_F | TGTAC^(a)T^(a)GAAT^(a)C^(a)C^(a)GC^(a)C^(a)TAAG | 11 | A25L_NC001611_105_128P_R | TGTGAATAAAGTAT^(a)C^(a)GC^(a)C^(a)C^(a)T^(a)AATA | 34 |
| 493 | A24R_NC001611_794_817P_F | TCGCGAT^(a)AAT^(a)AGATAGT^(a)GC^(a)T^(a)AAAC | 12 | A24R_NC001611_860_879P_R | TGCTTC^(a)C^(a)AC^(a)CAGGT^(a)CAT^(a)TAA | 35 |
| 979 | A18R_NC001611_90_117_F | TGATTTCGTAGAAGTTGAACCGGGATCA | 13 | A18R_NC001611_187_217_R | TCGCGATTTTATTATCGGTCGTTGTTAATGT | 36 |
| 980 | A18R_NC001611_91_117_F | TTCTCCCTAGAAGTTGAACCGGGATCA | 14 | A18R_NC001611_187_216_R | TCCCTCCCTATTATCGGTCGTTGTTAATGT | 37 |
| 981 | E9L_NC001611_1113_1133_F | TGGTGACGATACTACGGACGC | 15 | E9L_NC001611_1201_1235_R | TCCCTCCCAATATCTTTACGAATTACTTTACATAT | 38 |
| 982 | E9L_NC001611_1112_1133_F | TCGGTGACGATACTACGGACGC | 16 | E9L_NC001611_1205_1235_R | TCCTCCCTCCCATCTTTACGAATTACTTTAC | 39 |
| 983 | E9L_NC001611_1112_1133_F | TCGGTGACGATACTACGGACGC | 17 | E9L_NC001611_1205_1238_R | TCCTCCCTCCCAATATCTTTACGAATTACTTTAC | 40 |
| 984 | K8R_NC001611_207_238_F | TGGAAAAAAAGTATCTCCTCCATCACTAGGAA | 18 | K8R_NC001611_290_324_R | TCCCTCCCGAAAACTATAACATTCAAAGCTTATTG | 41 |
| 985 | K8R_NC001611_211_242_F | TGGAAAGTATCTCCTCCATCACTAGGAAAACC | 19 | K8R_NC001611_290_322_R | TCCCTCCCTCCCTATAACATTCAAAGCTTATTG | 42 |
| 986 | K8R_NC001611_213_238_F | TCCCTCCTCTCCTCCATCACTAGGAA | 20 | K8R_NC001611_290_319_R | TCCTCCCTCCCTAACATTCAAAGCTTATTG | 43 |
| 987 | A24R_NC001611_786_818_F | TCTAGTAAACGCGATAATAGATAGTGCTAAACG | 21 | A24R_NC001611_860_884_R | TGTTCAGCTTCCACCAGGTCATTAA | 44 |
| 988 | A24R_NC001611_788_818_F | TCCTCCTCGCGATAATAGATAGTGCTAAACG | 22 | A24R_NC001611_860_886_R | TGTGTTCAGCTTCCACCAGGTCATTAA | 45 |
| 989 | A24R_NC001611_789_817_F | TCCTCCCGCGATAATAGATAGTGCTAAAC | 23 | A24R_NC001611_860_883_R | TCCCAGCTTCCACCAGGTCATTAA | 46 |
| 1066 | A18R_NC001611_90_117_F | TGATTTCGTAGAAGTTGAACCGGGATCA | 13 | A18R_NC001611_187_216_R | TCCCTCCCTATTATCGGTCGTTGTTAATGT | 47 |
| 1067 | A18R_NC001611_91_117_F | TTCTCCCTAGAAGTTGAACCGGGATCA | 14 | A18R_NC001611_187_217_R | TCGCGATTTTATTATCGGTCGTTGTTAATGT | 48 |
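The primer naming convention used in Table 1 can be decoded mechanically. The following is a minimal sketch; the `PrimerName` type and `parse_primer_name` function are hypothetical, and reading the `P` suffix on the end coordinate as "propynylated" is an assumption inferred from the table rows, not a rule stated in the text.

```python
import re
from typing import NamedTuple


class PrimerName(NamedTuple):
    gene: str           # e.g. "K8R"
    accession: str      # GenBank accession of the reference, e.g. "NC001611"
    start: int          # first reference residue covered by the primer
    end: int            # last reference residue covered by the primer
    propynylated: bool  # assumption: a trailing "P" marks propynylated primers
    direction: str      # "F" (forward) or "R" (reverse)


_NAME = re.compile(r"^([A-Za-z0-9]+)_([A-Za-z]+\d+)_(\d+)_(\d+)(P?)_([FR])$")


def parse_primer_name(name: str) -> PrimerName:
    """Split a Table 1 style primer name into its fields."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    gene, accession, start, end, p_flag, direction = match.groups()
    return PrimerName(gene, accession, int(start), int(end),
                      p_flag == "P", direction)


print(parse_primer_name("K8R_NC001611_221_238_F"))
print(parse_primer_name("A18R_NC001611_100_117P_F"))
```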

Example 2 DNA Isolation and Amplification

Genomic materials from culture samples or swabs are prepared using the DNeasy® 96 Tissue Kit (Qiagen, Valencia, Calif.). All PCR reactions are assembled in 50 μl reactions in a 96-well microtiter plate format using a Packard MPII liquid handling robotic platform and MJ Dyad® thermocyclers (MJ Research, Waltham, Mass.). The PCR reaction consists of 4 units of Amplitaq Gold®, 1× buffer II (Applied Biosystems, Foster City, Calif.), 1.5 mM MgCl₂, 0.4 M betaine, 800 μM of dNTP mixture, and 250 nM of each primer.

The following PCR conditions can be used to amplify the sequences used for mass spectrometry analysis: 95° C. for 10 minutes followed by 8 cycles of 95° C. for 30 seconds, 48° C. for 30 seconds, and 72° C. for 30 seconds, with the 48° C. annealing temperature increased 0.9° C. after each cycle. The PCR is then continued for 37 additional cycles of 95° C. for 15 seconds, 56° C. for 20 seconds, and 72° C. for 20 seconds.

Example 3 Solution Capture Purification of PCR Products for Mass Spectrometry with Ion Exchange Resin-Magnetic Beads

For solution capture of nucleic acids with ion exchange resin linked to magnetic beads, 25 μl of a 2.5 mg/mL suspension of BioClon amine-terminated superparamagnetic beads were added to 25 to 50 μl of a PCR (or RT-PCR) reaction containing approximately 10 pM of a typical PCR amplification product. The suspension was mixed for approximately 5 minutes by vortexing or pipetting, after which the liquid was removed using a magnetic separator. The beads containing bound PCR amplification product were then washed 3 times with 50 mM ammonium bicarbonate/50% MeOH or 100 mM ammonium bicarbonate/50% MeOH, followed by three more washes with 50% MeOH. The bound PCR amplicon was eluted with 25 mM piperidine, 25 mM imidazole, 35% MeOH, plus peptide calibration standards.
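The cycling conditions above can be written out as an explicit step list, for example to check a thermocycler program. This is a sketch under one stated assumption: the first of the 8 cycles anneals at 48° C. and each later cycle in that block anneals 0.9° C. higher. The function name is hypothetical and ramp times are not modeled.

```python
def cycling_program():
    """Return the PCR program as (step name, temperature in deg C, seconds)."""
    steps = [("initial denaturation", 95.0, 600)]
    # Block 1: 8 cycles, annealing temperature stepped up 0.9 deg C per cycle
    # (assumption: cycle 1 anneals at 48.0 deg C).
    for cycle in range(8):
        anneal = round(48.0 + 0.9 * cycle, 1)
        steps += [("denature", 95.0, 30),
                  ("anneal", anneal, 30),
                  ("extend", 72.0, 30)]
    # Block 2: 37 cycles at a fixed 56 deg C annealing temperature.
    for _ in range(37):
        steps += [("denature", 95.0, 15),
                  ("anneal", 56.0, 20),
                  ("extend", 72.0, 20)]
    return steps


program = cycling_program()
first_block_anneal = [t for name, t, _ in program[1:25] if name == "anneal"]
hold_seconds = sum(seconds for _, _, seconds in program)
print(first_block_anneal)
print(hold_seconds)  # hold time only, excluding temperature ramps
```

Under this reading the last stepped cycle anneals at 54.3° C., just below the 56° C. used for the remaining 37 cycles.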

Example 4 Mass Spectrometry and Base Composition Analysis

The ESI-FTICR mass spectrometer is based on a Bruker Daltonics (Billerica, Mass.) Apex II 70e electrospray ionization Fourier transform ion cyclotron resonance mass spectrometer that employs an actively shielded 7 Tesla superconducting magnet. The active shielding constrains the majority of the fringing magnetic field from the superconducting magnet to a relatively small volume. Thus, components that might be adversely affected by stray magnetic fields, such as CRT monitors, robotic components, and other electronics, can operate in close proximity to the FTICR spectrometer. All aspects of pulse sequence control and data acquisition were performed on a 600 MHz Pentium II data station running Bruker's Xmass software under the Windows NT 4.0 operating system. Sample aliquots, typically 15 μl, were extracted directly from 96-well microtiter plates using a CTC HTS PAL autosampler (LEAP Technologies, Carrboro, N.C.) triggered by the FTICR data station. Samples were injected directly into a 10 μl sample loop integrated with a fluidics handling system that supplies the 100 μl/hr flow rate to the ESI source. Ions were formed via electrospray ionization in a modified Analytica (Branford, Conn.) source employing an off-axis, grounded electrospray probe positioned approximately 1.5 cm from the metalized terminus of a glass desolvation capillary. The atmospheric pressure end of the glass capillary was biased at 6000 V relative to the ESI needle during data acquisition. A counter-current flow of dry N₂ was employed to assist in the desolvation process. Ions were accumulated in an external ion reservoir comprising an rf-only hexapole, a skimmer cone, and an auxiliary gate electrode, prior to injection into the trapped ion cell where they were mass analyzed. Ionization duty cycles >99% were achieved by simultaneously accumulating ions in the external ion reservoir during ion detection. Each detection event consisted of 1M data points digitized over 2.3 s. To improve the signal-to-noise ratio (S/N), 32 scans were co-added for a total data acquisition time of 74 s.

The ESI-TOF mass spectrometer is based on a Bruker Daltonics MicroTOF™. Ions from the ESI source undergo orthogonal ion extraction and are focused in a reflectron prior to detection. The TOF and FTICR are equipped with the same automated sample handling and fluidics described above. Ions are formed in the standard MicroTOF™ ESI source that is equipped with the same off-axis sprayer and glass capillary as the FTICR ESI source. Consequently, source conditions were the same as those described above. External ion accumulation was also employed to improve ionization duty cycle during data acquisition. Each detection event on the TOF comprised 75,000 data points digitized over 75 μs.

The sample delivery scheme allows sample aliquots to be rapidly injected into the electrospray source at high flow rate and subsequently be electrosprayed at a much lower flow rate for improved ESI sensitivity. Prior to injecting a sample, a bolus of buffer was injected at a high flow rate to rinse the transfer line and spray needle to avoid sample contamination/carryover. Following the rinse step, the autosampler injected the next sample and the flow rate was switched to low flow. Following a brief equilibration delay, data acquisition commenced. As spectra were co-added, the autosampler continued rinsing the syringe and picking up buffer to rinse the injector and sample transfer line. In general, two syringe rinses and one injector rinse were required to minimize sample carryover. During a routine screening protocol a new sample mixture was injected every 106 seconds. More recently a fast wash station for the syringe needle has been implemented which, when combined with shorter acquisition times, facilitates the acquisition of mass spectra at a rate of just under one spectrum/minute.

Raw mass spectra were post-calibrated with an internal mass standard and deconvoluted to monoisotopic molecular masses. Unambiguous base compositions were derived from the exact mass measurements of the complementary single-stranded oligonucleotides. Quantitative results are obtained by comparing the peak heights with an internal PCR calibration standard present in every PCR well at 500 molecules per well. Calibration methods are commonly owned and disclosed in U.S. Provisional Patent Application Ser. No. 60/545,425.
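The acquisition figures quoted for the two instruments can be checked with simple arithmetic. The sketch below uses hypothetical helper names. The square-root scaling of S/N with the number of co-added scans is the usual rule for uncorrelated noise and is stated here as an assumption, not as a measured property of these instruments.

```python
import math


def total_acquisition_time(scans: int, seconds_per_scan: float) -> float:
    """Total time spent digitizing when `scans` transients are co-added."""
    return scans * seconds_per_scan


def sampling_rate_hz(points: int, duration_s: float) -> float:
    """Average digitization rate implied by a point count and a duration."""
    return points / duration_s


# FTICR: 32 co-added scans of 2.3 s each (about 74 s in the text).
ftir_total_s = total_acquisition_time(32, 2.3)
# Assumption: S/N grows roughly as sqrt(N) for N co-added scans.
snr_gain = math.sqrt(32)
# TOF: 75,000 points digitized over 75 microseconds.
tof_rate_hz = sampling_rate_hz(75_000, 75e-6)

print(f"{ftir_total_s:.1f} s, S/N gain ~{snr_gain:.2f}x, {tof_rate_hz:.3g} Hz")
```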
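The quantitation step can be sketched as a simple ratio against the internal calibrant. This sketch assumes that peak height is proportional to the number of starting molecules and that target and calibrant amplify with equal efficiency. The commonly owned calibration methods referenced above are not reproduced here, and the function name is hypothetical.

```python
def estimate_copies(amplicon_peak: float, calibrant_peak: float,
                    calibrant_copies: float = 500.0) -> float:
    """Estimate starting target molecules per well from two peak heights.

    Assumption: a linear response, so the peak-height ratio equals the
    ratio of starting molecule counts.
    """
    if calibrant_peak <= 0:
        raise ValueError("calibrant peak height must be positive")
    return calibrant_copies * amplicon_peak / calibrant_peak


# A target peak twice as tall as the 500-molecule calibrant peak.
print(estimate_copies(2.0, 1.0))
```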

Example 5 De Novo Determination of Base Composition of Amplification Products Using Molecular Mass Modified Deoxynucleotide Triphosphates

Because the molecular masses of the four natural nucleobases have a relatively narrow molecular mass range (A=313.058, G=329.052, C=289.046, T=304.046; see Table 2), a persistent source of ambiguity in assignment of base composition can occur as follows: two nucleic acid strands having different base composition may have a difference of about 1 Da when the base composition difference between the two strands is G↔A (−15.994) combined with C↔T (+15.000). For example, one 99-mer nucleic acid strand having a base composition of A₂₇G₃₀C₂₁T₂₁ has a theoretical molecular mass of 30779.058 while another 99-mer nucleic acid strand having a base composition of A₂₆G₃₁C₂₂T₂₀ has a theoretical molecular mass of 30780.052. A 1 Da difference in molecular mass may be within the experimental error of a molecular mass measurement and thus, the relatively narrow molecular mass range of the four natural nucleobases imposes an uncertainty factor.

The present invention provides for a means for removing this theoretical 1 Da uncertainty factor through amplification of a nucleic acid with one mass-tagged nucleobase and three natural nucleobases. The term “nucleobase” as used herein is synonymous with other terms in use in the art including “nucleotide,” “deoxynucleotide,” “nucleotide residue,” “deoxynucleotide residue,” “nucleotide triphosphate (NTP),” or “deoxynucleotide triphosphate (dNTP).”
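The two theoretical masses quoted above follow directly from the per-nucleobase masses. The sketch below reproduces them; the function and variable names are hypothetical, and the mass values are the ones given in the text.

```python
# Per-nucleobase masses as quoted in the text (see Table 2).
NATURAL_MASSES = {"A": 313.058, "G": 329.052, "C": 289.046, "T": 304.046}


def composition_mass(composition: dict, masses: dict = NATURAL_MASSES) -> float:
    """Sum the per-nucleobase masses for a base composition such as
    {"A": 27, "G": 30, "C": 21, "T": 21}."""
    return sum(masses[base] * count for base, count in composition.items())


first = composition_mass({"A": 27, "G": 30, "C": 21, "T": 21})
second = composition_mass({"A": 26, "G": 31, "C": 22, "T": 20})
print(f"{first:.3f} {second:.3f} difference {second - first:.3f}")
```

The two 99-mers differ by 0.994, which is the roughly 1 Da ambiguity the example describes.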

Addition of significant mass to one of the 4 nucleobases (dNTPs) in an amplification reaction, or in the primers themselves, will result in a significant difference in mass of the resulting amplification product (significantly greater than 1 Da), removing the ambiguity that arises from the G↔A combined with C↔T event (Table 2). Thus, the same G↔A (−15.994) event combined with a 5-Iodo-C↔T (−110.900) event would result in a molecular mass difference of 126.894. If the molecular mass of the base composition A₂₇G₃₀5-Iodo-C₂₁T₂₁ (33422.958) is compared with A₂₆G₃₁5-Iodo-C₂₂T₂₀ (33549.852), the theoretical molecular mass difference is +126.894. The experimental error of a molecular mass measurement is not significant with regard to this molecular mass difference. Furthermore, the only base composition consistent with a measured molecular mass of the 99-mer nucleic acid is A₂₇G₃₀5-Iodo-C₂₁T₂₁. In contrast, the analogous amplification without the mass tag has 18 possible base compositions.

TABLE 2 Molecular Masses of Natural Nucleobases and the Mass-Modified Nucleobase 5-Iodo-C and Molecular Mass Differences Resulting from Transitions

| Nucleobase | Molecular Mass | Transition | Δ Molecular Mass |
|---|---|---|---|
| A | 313.058 | A-->T | −9.012 |
| A | 313.058 | A-->C | −24.012 |
| A | 313.058 | A-->5-Iodo-C | 101.888 |
| A | 313.058 | A-->G | 15.994 |
| T | 304.046 | T-->A | 9.012 |
| T | 304.046 | T-->C | −15.000 |
| T | 304.046 | T-->5-Iodo-C | 110.900 |
| T | 304.046 | T-->G | 25.006 |
| C | 289.046 | C-->A | 24.012 |
| C | 289.046 | C-->T | 15.000 |
| C | 289.046 | C-->G | 40.006 |
| 5-Iodo-C | 414.946 | 5-Iodo-C-->A | −101.888 |
| 5-Iodo-C | 414.946 | 5-Iodo-C-->T | −110.900 |
| 5-Iodo-C | 414.946 | 5-Iodo-C-->G | −85.894 |
| G | 329.052 | G-->A | −15.994 |
| G | 329.052 | G-->T | −25.006 |
| G | 329.052 | G-->C | −40.006 |
| G | 329.052 | G-->5-Iodo-C | 85.894 |
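The effect of the mass tag can be illustrated by enumerating every 99-mer base composition whose theoretical mass falls within a tolerance of a measured mass. This is a minimal sketch: the 1.5 Da tolerance is a hypothetical value chosen for illustration, since the text does not state the measurement error, so the sketch is not expected to reproduce the count of 18 quoted above. All names are hypothetical.

```python
NATURAL = {"A": 313.058, "G": 329.052, "C": 289.046, "T": 304.046}
# Same masses, with C replaced by the 5-Iodo-C value from Table 2.
TAGGED = dict(NATURAL, C=414.946)


def candidate_compositions(length, measured_mass, tolerance, masses):
    """Return all (A, G, C, T) counts summing to `length` whose mass lies
    within `tolerance` of `measured_mass`."""
    hits = []
    for a in range(length + 1):
        for g in range(length + 1 - a):
            for c in range(length + 1 - a - g):
                t = length - a - g - c
                mass = (a * masses["A"] + g * masses["G"]
                        + c * masses["C"] + t * masses["T"])
                if abs(mass - measured_mass) <= tolerance:
                    hits.append((a, g, c, t))
    return hits


TOLERANCE_DA = 1.5  # hypothetical measurement tolerance, not from the text
natural_hits = candidate_compositions(99, 30779.058, TOLERANCE_DA, NATURAL)
tagged_hits = candidate_compositions(99, 33422.958, TOLERANCE_DA, TAGGED)
print(len(natural_hits), len(tagged_hits))
```

With natural nucleobases both A₂₇G₃₀C₂₁T₂₁ and A₂₆G₃₁C₂₂T₂₀ pass the filter, whereas with 5-Iodo-C the second composition lies about 127 Da away and is excluded.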

Example 6 Data Processing

Mass spectra of bioagent identifying amplicons are analyzed independently using a maximum-likelihood processor, such as is widely used in radar signal processing. This processor, referred to as GenX, first makes maximum likelihood estimates of the input to the mass spectrometer for each primer by running matched filters for each base composition aggregate on the input data. This includes the GenX response to a calibrant for each primer.

The algorithm emphasizes performance predictions culminating in probability-of-detection versus probability-of-false-alarm plots for conditions involving complex backgrounds of naturally occurring organisms and environmental contaminants. Matched filters consist of a priori expectations of signal values given the set of primers used for each of the bioagents. A genomic sequence database is used to define the mass base count matched filters. The database contains the sequences of known bacterial bioagents and includes threat organisms as well as benign background organisms. The latter is used to estimate and subtract the spectral signature produced by the background organisms.

A maximum likelihood detection of known background organisms is implemented using matched filters and a running-sum estimate of the noise covariance. Background signal strengths are estimated and used along with the matched filters to form signatures which are then subtracted. The maximum likelihood process is applied to this “cleaned up” data in a similar manner employing matched filters for the organisms and a running-sum estimate of the noise covariance for the cleaned up data.

The amplitudes of all base compositions of bioagent identifying amplicons for each primer are calibrated and a final maximum likelihood amplitude estimate per organism is made based upon the multiple single primer estimates. Models of all system noise are factored into this two-stage maximum likelihood calculation. The processor reports the number of molecules of each base composition contained in the spectra. The quantity of amplification product corresponding to the appropriate primer set is reported as well as the quantities of primers remaining upon completion of the amplification reaction.
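The two-stage estimate described above can be illustrated with a generic textbook matched filter. This sketch is not the GenX processor, whose internals are not given here. It assumes white noise (so the noise covariance is the identity and the running-sum covariance estimate is omitted) and signatures that do not overlap, and all names and data are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matched_filter_amplitude(spectrum, signature):
    """Least-squares amplitude of `signature` in `spectrum`; this is the
    maximum likelihood estimate under white Gaussian noise."""
    return dot(spectrum, signature) / dot(signature, signature)


def subtract_signature(spectrum, signature, amplitude):
    """Remove an estimated component from the spectrum."""
    return [x - amplitude * s for x, s in zip(spectrum, signature)]


def two_stage_estimate(spectrum, background_signature, target_signature):
    """Stage 1: estimate and subtract the background organism.
    Stage 2: estimate the target on the cleaned-up spectrum."""
    background_amp = matched_filter_amplitude(spectrum, background_signature)
    cleaned = subtract_signature(spectrum, background_signature, background_amp)
    target_amp = matched_filter_amplitude(cleaned, target_signature)
    return background_amp, target_amp


# Hypothetical, noise-free example: 3 units of background, 2 of target.
background = [1, 0, 1, 0, 0]
target = [0, 1, 0, 0, 1]
spectrum = [3 * b + 2 * t for b, t in zip(background, target)]
print(two_stage_estimate(spectrum, background, target))
```

If the signatures overlapped or the noise were correlated, the sequential estimate would need the noise covariance weighting that the text attributes to the processor.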

Example 7 Identification of Members of the Viral Genus Orthopoxvirus

DNA for five different test orthopoxvirus species was obtained from the laboratory of Dr. Chris Upton at University of Victoria, British Columbia, Canada: monkeypox (MPXV-VR267), cowpox (BR), rabbitpox (Utrecht), vaccinia (WR) and ectromelia (Moscow). PCR products corresponding to orthopoxvirus identifying amplicons were generated according to Example 2 from each of the test viruses using primer pair nos: 296, 297, 299, 310, 312 and 313 (Table 1). PCR products were purified according to Example 3 and analyzed by mass spectrometry according to Example 4 with data processing according to Example 6.

Spectra were processed by an algorithm that converts molecular mass to base composition data. All detected masses could be unambiguously mapped to specific base compositions, which were compared to the pre-compiled database of expected products from each of these viruses. FIG. 3 (primer pair number 299) and FIG. 4 (primer pair number 297) show the deconvoluted base compositions (solid cones) of the experimentally measured spectra in a three-dimensional plot (A, G, C axes, with the T counts represented by the tilt of the cone), overlaid on the expected base count distributions (hollow spheres) of the orthopoxvirus species where sequence data was available. Compositions for the test strains are shown as a solid cone projected onto the same plot. The experimentally determined base compositions were compared with compositions expected from the sequences in GenBank for all five viruses tested. Vaccinia and ectromelia viruses gave expected products consistent with the database sequence entry in each primer region. In the case of the rabbitpox virus, the sequence of the target region was identical to vaccinia virus in all primer regions selected and not distinguished by the primers described above.

At the time of primer design, the only strain of monkeypox virus deposited in GenBank was the Zaire 96_I-16 strain. The experimentally determined base compositions for the MPXV-VR267 strain were different from those for the Zaire strain. The experimentally determined base counts were subsequently validated by comparison to the full genome sequence for the VR267 strain (unpublished results, Chris Upton, University of Victoria). Thus a new variant of a known orthopoxvirus species was identified with the same technology used for primary detection, without the requirement of additional analysis and/or design.

A whole genome sequence for a new strain of cowpox, GRI-90 strain, was published as these experiments were in progress. Analysis of several conserved genes across all of the orthopoxvirus genera revealed that this strain was closer to vaccinia strains than it was to the previously known Brighton Red strains of cowpox. The material that was tested in the lab was clearly the BR strain as evidenced by the perfect match to the expected base counts for these in the database.

Table 3 shows the expected base counts of the various orthopoxvirus species for all primer regions tested. The isolates used in this test are indicated. In every test instance, the experimentally measured signals matched database predicted base compositions. While a single primer target region might not resolve all species unambiguously, species can be clearly identified and differentiated from one another using the triangulation strategy with multiple orthopoxvirus identifying amplicons obtained from priming of different genetic loci. For example, primer pair no. 310 does not distinguish the CMS and M-92(2) strains of Camelpox virus but primer pair 296 does distinguish these two strains because it produces two distinct base compositions.
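The triangulation strategy can be sketched as a set intersection over primer pairs. The base counts below are copied from four rows of Table 3 for primer pairs 310 and 296 (in [A G C T] order), with strain labels as they appear in the table. The dictionary layout and the `triangulate` function are hypothetical.

```python
# Expected [A, G, C, T] base counts per primer pair, from Table 3.
EXPECTED = {
    "Camelpox virus CMS":        {310: (38, 11, 23, 19), 296: (32, 20, 23, 33)},
    "Camelpox virus M-96":       {310: (38, 11, 23, 19), 296: (32, 19, 23, 34)},
    "Cowpox virus Brighton Red": {310: (33, 14, 18, 26), 296: (36, 18, 23, 31)},
    "Vaccinia virus Copenhagen": {310: (38, 10, 24, 19), 296: (32, 21, 24, 31)},
}


def triangulate(measured, database=EXPECTED):
    """Return the strains whose expected base composition matches the
    measured one for every primer pair in `measured`."""
    candidates = set(database)
    for primer_pair, composition in measured.items():
        candidates = {strain for strain in candidates
                      if database[strain].get(primer_pair) == composition}
    return candidates


one_pair = triangulate({310: (38, 11, 23, 19)})
two_pairs = triangulate({310: (38, 11, 23, 19), 296: (32, 20, 23, 33)})
print(sorted(one_pair))
print(sorted(two_pairs))
```

A single measurement with primer pair 310 leaves both camelpox strains as candidates; adding the primer pair 296 measurement narrows the result to one strain.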

TABLE 3 Orthopoxvirus Species Base Compositions [A G C T] for Primer Pair Nos: 296, 297, 299, 310, 312 and 313

| Orthopoxvirus Species | Strain | GenBank Accession | Primer Pair No: 310 | Primer Pair No: 296 | Primer Pair No: 313 | Primer Pair No: 299 | Primer Pair No: 312 | Primer Pair No: 297 |
|---|---|---|---|---|---|---|---|---|
| Camelpox virus | CMS | AY009089 | A38 G11 C23 T19 | A32 G20 C23 T33 | A29 G15 C14 T26 | A38 G23 C16 T30 | A30 G19 C18 T33 | A37 G17 C22 T22 |
| Camelpox virus | M-96 | AF438165 | A38 G11 C23 T19 | A32 G19 C23 T34 | A29 G15 C14 T26 | A38 G23 C16 T30 | A30 G19 C18 T33 | A37 G17 C22 T22 |
| Cowpox virus | Brighton Red | AF482758 | A33 G14 C18 T26 | A36 G18 C23 T31 | A29 G15 C16 T24 | A36 G25 C17 T29 | A25 G24 C21 T30 | A36 G17 C22 T20 |
| Cowpox virus | GRI-90 | X94355 | A37 G11 C24 T19 | A33 G19 C24 T32 | A30 G15 C13 T26 | A36 G25 C17 T29 | A27 G23 C19 T31 | A36 G18 C22 T22 |
| Ectromelia virus | Moscow | AF012825 | A34 G13 C17 T27 | A33 G19 C24 T32 | A30 G15 C13 T26 | A38 G25 C15 T29 | A27 G22 C19 T32 | A38 G16 C22 T22 |
| Monkeypox virus | WR-267 | AY603973 | A34 G14 C18 T25 | A33 G20 C22 T33 | A29 G15 C15 T25 | A39 G24 C16 T28 | A28 G20 C21 T34 | A36 G17 C22 T20 |
| Monkeypox virus | Zaire-96-I-16 | AF380138 | A34 G14 C18 T25 | A33 G20 C22 T33 | A28 G16 C15 T25 | A40 G24 C14 T29 | A28 G20 C21 T34 | A34 G19 C22 T20 |
| Vaccinia virus | Copenhagen | M35027 | A38 G10 C24 T19 | A32 G21 C24 T31 | A30 G15 C13 T26 | A37 G25 C16 T29 | A25 G23 C20 T31 | A38 G16 C21 T23 |
| Vaccinia virus | Tian Tan | AF095689 | A36 G12 C24 T19 | A32 G21 C24 T31 | A30 G15 C13 T26 | A37 G25 C16 T29 | A27 G22 C19 T31 | A38 G16 C21 T23 |
| Vaccinia virus | Western Reserve | AY243312 | A36 G12 C24 T19 | A33 G20 C23 T32 | A30 G15 C13 T26 | A37 G25 C16 T29 | A27 G23 C18 T32 | A37 G17 C21 T23 |
| Vaccinia virus | Ankara | U94848 | A36 G12 C24 T19 | A33 G20 C23 T32 | A30 G15 C13 T26 | A37 G25 C16 T29 | A25 G24 C20 T31 | A38 G16 C21 T23 |
| Vaccinia virus | Rabbitpox Utrecht | AY484669 | A36 G12 C24 T19 | A33 G20 C23 T32 | A30 G15 C13 T26 | A37 G25 C16 T29 | A25 G24 C20 T31 | A37 G17 C21 T23 |
| Variola major virus | Bangladesh-1975 | L22579 | A36 G11 C24 T20 | A33 G20 C20 T35 | A28 G16 C14 T26 | A36 G23 C15 T30 | A28 G21 C16 T35 | A36 G18 C21 T23 |
| Variola major virus | India-1967 | S55844 | A36 G11 C24 T20 | A33 G20 C20 T35 | A28 G16 C14 T26 | A36 G23 C15 T30 | A28 G21 C16 T35 | A36 G18 C21 T23 |
| Variola major virus | Garcia-1966 | Y16780 | A36 G11 C24 T20 | A34 G19 C21 T34 | A28 G16 C14 T26 | A36 G23 C15 T30 | A28 G21 C16 T35 | A36 G18 C21 T23 |

Various modifications of the invention, in addition to those described herein, will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims. Each reference (including, but not limited to, journal articles, U.S. and non-U.S. patents, patent application publications, international patent application publications, gene bank accession numbers, internet web sites, and the like) cited in the present application is incorporated herein by reference in its entirety. Those skilled in the art will appreciate that numerous changes and modifications may be made to the embodiments of the invention and that such changes and modifications may be made without departing from the spirit of the invention. It is therefore intended that the appended claims cover all such equivalent variations as fall within the true spirit and scope of the invention.

1-4. (canceled)
5. A method for identification of an unknown orthopoxvirus comprising: amplifying nucleic acid from said orthopoxvirus using the composition of claim 4 to obtain an amplification product; measuring the molecular mass of said amplification product; optionally, determining the base composition of said amplification product from said molecular mass; and comparing said molecular mass or base composition with a plurality of molecular masses or base compositions of known orthopoxvirus bioagent identifying amplicons, wherein a match between said molecular mass or base composition and a member of said plurality of molecular masses or base compositions identifies said unknown orthopoxvirus.
6. A method of determining the presence or absence of an orthopoxvirus species in a sample comprising: amplifying nucleic acid from said sample using the composition of claim 4 to obtain an amplification product; determining the molecular mass of said amplification product; optionally, determining the base composition of said amplification product from said molecular mass; and comparing said molecular mass or base composition of said amplification product with the known molecular masses or base compositions of one or more known orthopoxvirus species bioagent identifying amplicons, wherein a match between said molecular mass or base composition of said amplification product and the molecular mass or base composition of one or more known orthopoxvirus species bioagent identifying amplicons indicates the presence of said orthopoxvirus species in said sample.
7. A method for determination of the quantity of an unknown orthopoxvirus in a sample comprising: contacting said sample with the composition of claim 4 and a known quantity of a calibration polynucleotide comprising a calibration sequence; concurrently amplifying nucleic acid from said orthopoxvirus in said sample with the composition of claim 4 and amplifying nucleic acid from said calibration polynucleotide in said sample with the composition of claim 4 to obtain a first amplification product comprising an orthopoxvirus bioagent identifying amplicon and a second amplification product comprising a calibration amplicon; determining the molecular mass and abundance for said orthopoxvirus bioagent identifying amplicon and said calibration amplicon; and distinguishing said orthopoxvirus bioagent identifying amplicon from said calibration amplicon based on molecular mass, wherein comparison of orthopoxvirus bioagent identifying amplicon abundance and calibration amplicon abundance indicates the quantity of orthopoxvirus in said sample.
8. The method of claim 7 further comprising repeating said steps, wherein a different primer pair is used, wherein each member of said different primer pair is of a length of 13 to 35 nucleobases and comprises 70% to 100% sequence identity with the corresponding member of any of the pairs of primers of SEQ ID NOs: 1:24, 2:25, 3:26, 5:28, 6:29, or 7:30.
9. The method of claim 7 further comprising repeating said steps, wherein two different primer pairs are used, wherein each member of said two different primer pairs is of a length of 13 to 35 nucleobases and comprises 70% to 100% sequence identity with the corresponding member of any of the pairs of primers of SEQ ID NOs: 1:24, 2:25, 3:26, 5:28, 6:29, or 7:30.
10. The method of claim 7 further comprising repeating said steps, wherein three different primer pairs are used, wherein each member of said three different primer pairs is of a length of 13 to 35 nucleobases and comprises 70% to 100% sequence identity with the corresponding member of any of the pairs of primers of SEQ ID NOs: 1:24, 2:25, 3:26, 5:28, 6:29, or 7:30.
11. The method of claim 7 further comprising repeating said steps, wherein four different primer pairs are used, wherein each member of said four different primer pairs is of a length of 13 to 35 nucleobases and comprises 70% to 100% sequence identity with the corresponding member of any of the pairs of primers of SEQ ID NOs: 1:24, 2:25, 3:26, 5:28, 6:29, or 7:30.
12-18. (canceled)